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Abstract.  Tools  for  computational  differentiation  transform  a  program  that  com¬ 
putes  a  numerical  function  F(x )  into  a  related  program  that  computes  F' (x)  (the 
derivative  of  F).  This  paper  describes  how  techniques  similar  to  those  used  in 
computational-differentiation  tools  can  be  used  to  implement  other  program  transformations — i 
in  particular,  a  variety  of  transformations  for  computational  divided  differencing .  The 
specific  technical  contributions  of  the  paper  are  as  follows: 

—  It  presents  a  program  transformation  that,  given  a  numerical  function  F(x) 
defined  by  a  program,  creates  a  program  that  computes  F[x o,jci],  the  first 
divided  difference  of  F( x),  where 


F[x0,  xi\ 


F(xq)-F(x1) 

XQ-X! 

j^F(z),  evaluated  at  z  =  xo 


if  xo  ^  xi 
if  xo  =  xi 


—  It  shows  how  computational  first  divided  differencing  generalizes  computa- 
tional  differentiation. 

—  It  presents  a  second  program  transformation  that  permits  the  creation  of 
higher-order  divided  differences  of  a  numerical  function  defined  by  a  program. 

—  It  shows  how  to  extend  these  techniques  to  handle  functions  of  several  variables. 

The  paper  also  discusses  how  computational  divided-differencing  techniques  could 
lead  to  faster  and/or  more  robust  programs  in  scientific  and  graphics  applications. 

Finally,  the  paper  describes  how  computational  divided  differencing  relates  to 
the  numerical-finite-differencing  techniques  that  motivated  Robert  Paige’s  work  on 
finite  differencing  of  set- valued  expressions  in  SETL  programs. 
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1.  Introduction 

A  variety  of  studies  in  the  field  of  programming  languages  have  led  to 
useful,  high-level  transformations  that  manipulate  programs  in  seman¬ 
tically  meaningful  ways.  In  very  general  terms,  these  tools  transform 
a  program  that  performs  a  computation  F(x)  into  a  program  that 
performs  a  related  computation  F$(x),  for  a  variety  of  F^s  of  interest.1 
(In  some  cases,  an  appropriate  preprocessing  operation  h  needs  to  be 
applied  to  the  input;  in  such  cases,  the  transformed  program  F:  is  used 
to  perform  a  computation  of  the  form  F$(h(x)).)  Examples  of  such 
tools  include  partial  evaluators  and  program,  slicers: 

—  A  partial  evaluator  creates  a  specialized  version  of  a  program 
when  only  part  of  the  program’s  input  has  been  supplied  (Fu- 
tamura,  1971;  Futamura,  1999;  Bjprner  et  al.,  1988;  Jones  et  al., 
1993).  A  partial  evaluator  transforms  a  program  that  performs 
a  computation  i7’((s,d)),  where  F  operates  on  a  pair  of  inputs 
(s.  d):  when  s  is  known,  partial  evaluation  of  the  program  with 
respect  to  s  results  in  a  program  that  computes  the  function  Fs(d ) 
(=  _F5(second((s, d)))),  such  that 

Fs(d)=F({s,d)).  (1) 

(The  mnemonic  is  that  s  and  d  stand  for  the  “static”  and  “dy¬ 
namic”  parts  of  the  input,  respectively.) 

Partial  evaluation  is  useful  for  removing  interpretive  overhead,  and 
can  also  speed  up  programs  that  have  two  arguments  that  change 
value  at  different  rates  (such  as  ray  tracing  (Mogensen,  1986)). 

—  The  slice  of  a  program  with  respect  to  a  set  of  program  elements  S 
is  a  projection  of  the  program  that  includes  only  program  elements 
that  might  affect  (either  directly  or  transitively)  the  values  of  the 
variables  used  at  members  of  S  (Weiser,  1984;  Ottenstein  and 
Ottenstein,  1984;  Horwitz  et  al.,  1990).  Given  a  program  that  com¬ 
putes  a  function  F.  one  version  of  the  slicing  problem  focuses  on 
creating  the  slice  by  symbolically  composing  the  original  program 
with  an  appropriate  projection  function  i r,  where  7r  characterizes 
what  part  of  F's  output  should  be  discarded  and  what  part  should 

1  In  this  paper,  we  do  not  generally  make  a  distinction  between  programs  and 
procedures.  We  use  “program”  both  to  refer  to  the  program  as  a  whole,  as  well  as  to 
refer  to  individual  subroutines  in  a  generic  sense.  We  use  “procedure”  only  in  places 
where  we  wish  to  emphasize  that  the  focus  of  interest  is  an  individual  subroutine 
per  se. 
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be  retained  (Reps  and  Turnidge,  1996).  Program  slicing  creates  a 
program  that  computes  F" .  where 

F*{x)  =  (vr  o  F)(x).  (2) 

Program-slicing  tools  allow  one  to  find  semantically  meaningful 
decompositions  of  programs,  where  the  decompositions  consist  of 
elements  that  are  not  textually  contiguous.  Slicing,  and  subse¬ 
quent  manipulation  of  slices,  has  applications  in  many  software¬ 
engineering  tools,  including  ones  for  program  understanding,  main¬ 
tenance  (Gallagher  and  Lyle,  1991),  debugging  (Lyle  and  Weiser, 
1986),  testing  (Binkley,  1992;  Bates  and  Horwitz,  1993),  differenc¬ 
ing  (Horwitz  et  al.,  1989;  Horwitz,  1990),  specialization  (Reps  and 
Turnidge,  1996),  reuse  (Ning  et  al.,  1994),  and  merging  (Horwitz 
et  al.,  1989). 

Less  well  known  in  the  programming-languages  community  is  the 
work  that  has  been  done  by  numerical  analysts  on  tools  for  com¬ 
putational  differentiation  (also  known  as  automatic  differentiation  or 
algorithmic  differentiation)  (Wengert,  1964;  Rail,  1981;  Griewank  and 
Corliss,  1992;  Berz  et  al.,  1996;  Griewank,  2000): 

—  Given  a  program  that  computes  a  numerical  function  F(x),  a 
computational-differentiation  tool  creates  a  related  program  that 
computes  F'  lx)  (the  derivative  of  F) . 

—  Applications  of  computational  differentiation  include  optimization, 
solving  differential  equations,  curve  fitting,  and  sensitivity  analy¬ 
sis. 

Although  in  each  of  the  cases  mentioned  above,  the  appropriate  restruc¬ 
turing  of  the  program  could  be  carried  out  by  hand,  hand  transforma¬ 
tion  of  a  program  is  an  error-prone  process.  The  automated  assistance 
that  the  aforementioned  tools  provide  prevents  errors  from  being  intro¬ 
duced  when  these  transformations  are  applied.  The  work  described  in 
this  paper  expands  the  set  of  tools  that  programmers  have  at  their  dis¬ 
posal  for  performing  such  high-level,  semantically  meaningful  program 
manipulations. 

Because  so  much  scientific,  engineering,  and  graphical  software  tries 
to  predict  and  render  modeled  situations,  such  software  often  performs 
extrapolation  and/or  interpolation.  These  operations  inevitably  involve 
the  computation  of  divided  differences,  which  can  suffer  badly  from 
round-off  error.  In  some  cases,  which  motivate  this  paper,  numerically 
unstable  algorithms  can  be  stabilized  by  computing  divided  differences 
in  a  non-standard  way.  Because  it  can  be  a  tedious  task  to  restructure 
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a  program  to  perform  computations  in  this  way,  it  is  worthwhile  de¬ 
veloping  an  automated  approach.  The  paper  describes  how  techniques 
similar  to  those  that  have  been  developed  for  computational  differen¬ 
tiation  can  be  used  to  transform  programs  that  compute  numerical 
functions  into  ones  that  compute  divided  differences. 

The  specific  technical  contributions  of  the  paper  are  as  follows: 


—  We  present  a  program  transformation  that,  given  a  numerical  func¬ 
tion  F(x)  defined  by  a  program,  creates  a  program  that  computes 
F[x o,aq],  the  first  divided  difference  of  F(x ),  where 


F[x  o,xi]d=f 


F(xp)-F(xi ) 
xo—xi 

j;F(z),  evaluated  at  z 


if  xq  /  xi 
Xq  if  X(j  —  X\ 


—  We  show  how  computational  first  divided  differencing  generalizes 
computational  differentiation. 

—  We  present  a  second  program  transformation  that  permits  the 
creation  of  higher-order  divided  differences  of  a  numerical  function 
defined  by  a  program. 

—  We  present  a  third  program  transformation  that  permits  higher- 
order  divided  differences  to  be  computed  more  efficiently.  This 
transformation  does  not  apply  to  all  programs;  however,  we  show 
that  there  is  at  least  one  important  situation  where  this  optimiza¬ 
tion  is  of  use. 


—  We  show  how  to  extend  these  techniques  to  handle  functions  of 
several  variables. 


—  Finally,  we  describe  how  our  work  on  computational  differenc¬ 
ing  relates  to  the  numerical- finite-differencing  techniques  that  mo¬ 
tivated  Robert  Paige’s  work  on  finite  differencing  of  set-valued 
expressions  in  SETL  programs  (Paige,  1981;  Paige  and  Koenig, 
1982). 

These  ideas  are  illustrated  by  means  of  numerous  examples. 

Such  program  transformations  can  be  implemented  either  as  a  source- 
to-source  translation,  or  by  means  of  overloaded  operators  and  reinter¬ 
preted  operands  (in  which  case  the  source  code  is  changed  very  little). 
The  examples  in  the  paper  primarily  illustrate  the  latter  approach; 
the  paper  presents  sketches  of  implementations  of  the  various  trans¬ 
formations  in  the  form  of  C++  class  definitions  (“divided-difference 
arithmetics” ) . 

The  benefits  gained  from  the  techniques  described  in  the  paper 
include  the  following: 
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—  Because  divided  differences  are  the  basis  for  a  wide  variety  of  nu¬ 
merical  techniques,  including  polynomial  interpolation,  numerical 
integration,  and  solving  differential  equations  (Conte  and  de  Boor, 
1972),  this  work  could  lead  to  more  robust  programs  in  scientific 
and  graphics  applications,  when  the  function  of  interest  is  one  that 
is  defined  by  a  program. 

—  Finite  differences  on  an  evenly  spaced  grid  can  be  used  to  quickly 
generate  a  function’s  values  at  any  number  of  points  that  ex¬ 
tend  the  grid  (see  (Goldstine,  1977)  and  (Paige  and  Koenig,  1982, 
pp.  403-404)).  Because  finite  differences  on  an  evenly  spaced  grid 
can  be  obtained  from  divided  differences  on  an  evenly  spaced  grid, 
our  techniques  may  be  useful  in  graphics  applications  for  quickly 
plotting  a  function,  while  retaining  reasonable  accuracy. 

—  Because  the  divided-differencing  problems  that  we  address  can 
be  viewed  as  generalizations  of  problems  such  as  differentiation, 
computation  of  Taylor  coefficients,  etc.,  some  of  our  techniques — in 
particular,  the  divided-difference  arithmetic  presented  in  Sect.  7 — 
represent  new  approaches  that,  with  appropriate  simplification, 
can  also  be  applied  in  computational-differentiation  tools. 

Empirical  results  presented  in  Sects.  5  and  8  provide  two  concrete 
demonstrations  of  some  of  the  benefits  that  can  be  gained  via  our 
methods. 

The  remainder  of  the  paper  is  organized  into  eight  sections:  To  make 
the  paper  self-contained,  Sect.  2  provides  a  succinct  review  of  the  ba¬ 
sic  principles  of  computational  differentiation  that  are  relevant  to  our 
work,  and  of  so-called  “differentiation  arithmetics”  (Rail,  1986;  Rail, 
1990;  Rail,  1992).  Sect.  3  discusses  the  basic  principle  behind  compu¬ 
tational  divided  differencing.  Sect.  4  shows  how  computational  divided 
differencing  generalizes  computational  differentiation.  Sect.  5  extends 
the  ideas  introduced  in  Sect.  3  to  higher-order  computational  divided 
differencing.  Sect.  6  discusses  techniques  that  apply  to  a  useful  special 
case.  Sect.  7  extends  the  ideas  from  Sects.  3,  5,  and  6  to  functions 
of  several  variables.  Sect.  8  describes  how  these  ideas  relate  to  the 
numerical-finite-differencing  techniques  that  motivated  Robert  Paige’s 
work  on  finite  differencing  of  set- valued  expressions  in  SETL  programs. 
Sect.  9  discusses  other  related  work. 
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2.  Background  on  Computational  Differentiation 

One  bright  spot  during  the  last  thirty  years  with  respect  to  the  control 
of  round-off  error  has  been  the  emergence  of  tools  for  computational 
differentiation ,  which  transform  a  program  that  computes  a  numerical 
function  F(x)  into  a  related  program  that  computes  the  derivative 
F'(x).2  These  tools  address  the  following  issue:  Suppose  that  you  have 
a  program  F(x)  that  computes  a  numerical  function  Fix).  ’’  It  is  a  very 
bad  idea  to  try  to  compute  F'(x o),  the  value  of  the  derivative  of  F  at  xq. 
by  picking  a  small  value  deltajc  and  invoking  the  following  program 
with  the  argument  x0:4 


(2) 


For  a  small  enough  value  of  deltas,  the  values  of  F(x0+deltajc)  and 
F  (x0)  will  usually  be  very  close.  Round-off  errors  in  the  computation  of 
F  (xo+deltajc)  and  F(x0)  are  magnified  by  the  subtraction  of  the  two 
quantities,  and  further  amplified  by  the  division  by  the  small  quantity 
deltajc,  which  may  cause  the  overall  result  to  be  useless.  Computa¬ 
tional  differentiation  sidesteps  this  problem  by  computing  derivatives 
in  another  fashion. 

Computational  differentiation  can  be  illustrated  by  means  of  the 
following  example: 

EXAMPLE  2.1.  (Zippel,  1996).  Suppose  that  we  have  been  given  a 
collection  of  programs  fi  for  the  functions  1  <i<  k.  together 

2  Another  bright  spot  has  been  the  application  of  interval  arithmetic  to  the 
verification  of  the  accuracy  of  computed  results,  for  many  basic  numerical  com¬ 
putations  (Hammer  et  al.,  1993;  Hammer  et  al.,  1995). 

3  Throughout  the  remainder  of  the  paper,  Courier  Font  is  used  to  denote  func¬ 
tions  defined  by  programs,  whereas  Italic  Font  is  used  to  denote  mathematical 
functions.  That  is,  F(x )  denotes  a  function  (evaluated  over  real  numbers),  whereas 
F(x)  denotes  a  program  (evaluated  over  floating-point  numbers).  We  adhere  to  this 
convention  both  in  concrete  examples  that  involve  C++  code,  as  well  as  in  more 
abstract  discussions  in  order  to  distinguish  between  a  mathematical  function  and  a 
program  that  implements  the  function. 

4  The  example  programs  in  the  paper  are  all  written  in  C++;  however,  the 
ideas  described  apply  to  other  programming  languages — including  functional  pro¬ 
gramming  languages  (cf.  (Karczmarczuk,  1999)) — as  well  as  to  other  imperative 
languages. 

To  emphasize  the  links  between  mathematical  concepts  and  their  implementar 
tions  in  C++,  we  take  the  liberty  of  sometimes  using  '  and/or  subscripts  on  C++ 
identifiers. 


float  delta_x  =  ... (some  small  value)  .  . . ; 
float  F'jiaive  (float  x){ 

return  (F(x  +  delta_x)  -  F (x) ) /delta jc ; 

} 
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with  the  program  Prod  shown  below,  which  computes  the  function 
Prod(x)  =  Ui=ifi  (x).  In  addition,  suppose  that  we  have  also  been 
given  programs  fi  for  the  functions  //,  1  <  i  <  k.  Finally,  suppose 
that  we  wish  to  obtain  a  program  Prod'  that  computes  the  function 
Prod'(x).  Column  two  of  the  table  given  below  shows  mathematical 
expressions  for  Prod(x)  and  Prod'(x).  Column  three  shows  two  C++ 
procedures:  Procedure  Prod  computes  Prod(x );  procedure  Prod'  is  the 
procedure  that  a  computational-differentiation  system  would  create  to 
compute  Prod'(x). 


Mathematical  Notation 

Programming  Notation 

Function 

k 

Prod(x)  =  Y[fi(x) 

i= 1 

float  Prod (float  x) { 
float  ans  =  1.0; 
for  (int  i  =  1;  i  <=  k;  i++){ 
ans  =  ans  *  fi(x); 

} 

return  ans ; 

} 

Derivative 

k 

Prod' (x)  =  £/!(*)  *  Y[fj{x) 
i= 1 

float  Prod' (float  x){ 
float  ans'  =  0.0; 
float  ans  =  1.0; 
for  (int  i  =  1;  i  <=  k;  i++){ 
ans'  =  ans'  *  fi(x)  +  ans  *  ff(x); 
ans  =  ans  *  fi(x); 

} 

return  ans' ; 

} 

Notice  that  program  Prod'  resembles  program  Prod,  as  opposed  to 
F'jiaive  (see  box  (2)).  Prod'  preserves  accuracy  in  its  computation  of 
the  derivative  because,  as  illustrated  below  in  Example  2.2,  it  is  based 
on  the  rules  for  the  exact  computation  of  derivatives,  rather  than  on 
the  kind  of  computation  performed  by  F'maive.  □ 

The  transformation  illustrated  above  is  merely  one  instance  of  a  gen¬ 
eral  transformation  that  can  be  applied  to  any  program:  Given  a  pro¬ 
gram  G  as  input,  the  transformation  produces  a  derivative-computing 
program  G'.  The  method  for  constructing  G'  is  as  follows: 

—  For  each  variable  v  of  type  float  used  in  G,  another  float  variable 
v'  is  introduced. 
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—  Each  statement  in  G  of  the  form  “v  =  exp ; " ,  where  exp  is  an 
arithmetic  expression,  is  transformed  into  “v'  =  exp' ;  v  =  exp ;  ” , 
where  exp'  is  the  expression  for  the  derivative  of  exp.  If  exp  involves 
calls  to  a  function  g,  then  exp'  may  involve  calls  to  both  g  and  g'. 

—  Each  return  statement  in  G  of  the  form  “return  v ;  ”  is  transformed 
into  “return  v' ;  ”, 

In  general,  this  transformation  can  be  justified  by  appealing  to  the 
chain  rule  of  differential  calculus  (see  below) . 

EXAMPLE  2.2.  For  Example  2.1,  we  can  demonstrate  the  correct¬ 
ness  of  the  transformation  by  symbolically  executing  Prod'  for  a  few 
iterations,  comparing  the  values  of  ans'  and  ans  (as  functions  of  x)  at 
the  start  of  each  iteration  of  the  for-loop: 


Iteration 

Value  of  ans'  (as  a  function  of  x) 

Value  of  ans 
(as  a  function  of  x) 

0 

0.0 

1.0 

1 

fi(x) 

fi(x) 

2 

f  j(x)  *  f2(x)  -t-fi(x)  *f2(x) 

fi(x)  *f2(x) 

3 

f i(x)  *  f2(x)  *  f 3 (x) 

+  f  l(x)  *  f2(x)  *  f 3(x) 

+  fl(x)  *f2(x)  *f3(X) 

fi(x)  *  f2(x)  *  f3(x) 

k 

k 

Ef'i(x)*Ilfi(x) 

*= 1 

n*iW 

i=  1 

The  loop  maintains  the  invariant  that,  at  the  start  of  each  iteration, 
ans'(x)  =  ^ans(x).5  □ 

For  the  computational-differentiation  approach,  we  did  not  really 
need  to  make  the  assumption  that  we  were  given  programs  fj  for 

5  The  value  of  ans'  on  the  3rd  iteration  would  actually  be  computed  with 
the  terms  grouped  as  follows:  (f'1(x)*f2(x)+fi(x)*f2(x))*f3(x)+(fi(x)*f2(x))*f3(x). 
Terms  have  been  expanded  in  the  table  given  above  to  clarify  how  ans'  builds  up 

a  value  that  is  equivalent — from  the  standpoint  of  evaluation  in  real  arithmetic — to 

k 

Prod'(x)  = 

i=l  j^i 
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the  functions  //,  1  <  i  <  k:  instead,  the  programs  f  (  can  be  gener¬ 
ated  from  the  programs  fi  by  applying  the  same  statement-doubling 
transformation  that  was  applied  to  Prod. 

In  languages  that  support  operator  overloading,  such  as  C++,  Ada, 
and  Pascal-XSC,  computational  differentiation  can  be  carried  out  by 
defining  a  new  data  type  that  has  fields  for  both  the  value  and  the 
derivative,  and  overloading  the  arithmetic  operators  to  carry  out  ap¬ 
propriate  manipulations  of  both  fields  (Rail,  1983;  Rail,  1984),  along  the 
lines  of  the  definition  of  the  C++  class  FloatD,  shown  in  Fig.  1.  A  class 
such  as  FloatD  is  called  a  differentiation  arithmetic  (Rail,  1986;  Rail, 
1990;  Rail,  1992). 

The  transformation  then  amounts  to  changing  the  types  of  each  pro¬ 
cedure’s  formal  parameters,  local  variables,  and  return  value  (including 
those  of  the  f  i).6 


EXAMPLE  2.3.  Using  class  FloatD,  the  Prod  program  of  Example  2.1 
can  be  handled  as  follows: 


float  fj+float  x)  {  .  .  .  } 

FloatD  fi (const  FloatD  &x){...} 

float  fk (float  x) { . .  .  } 

FloatD  fk (const  FloatD  &x){...} 

float  Prod (float  x){ 

FloatD  Prod (const  FloatD  &x){ 

float  ans  =  1.0; 

FloatD  ans (CONST, 1 . 0) ;  //  ans  =  1.0 

for  (int  i  =  1;  i  <=  k;  i++){ 

for  (int  i  =  1;  i  <=  k;  i++){ 

ans  =  ans  *  fi(x); 

ans  =  ans  *  fi(x); 

} 

} 

return  ans ; 

return  ans ; 

} 

} 

float  Prod' (float  x)  { 

FloatD  xD(VAR,x); 
return  Prod(xD) . val' ; 

} 

By  changing  the  types  of  the  formal  parameters,  local  variables,  and 
the  return  values  of  Prod  and  the  fi  (and  making  a  slight  change 

6  We  have  referred  to  both  computational  differentiation  and  computational  di¬ 
vided  differencing  as  “program  transformations” ,  which  may  conjure  up  the  image 
of  tools  that  perform  source-to-source  rewriting  fully  automatically.  Although  this  is 
one  possible  embodiment,  in  this  paper  the  term  “transformation”  will  also  include 
the  use  of  C++  classes  in  which  the  arithmetic  operators  have  been  overloaded. 
With  the  latter  approach,  rewriting  might  be  carried  out  by  a  preprocessor,  but 
might  also  be  performed  by  hand,  since  usually  only  light  rewriting  of  the  program 
source  text  is  required. 
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enum  ArgDesc  {  CONST,  VAR  }; 
class  FloatD  { 
public : 
float  val'; 
float  val; 

FloatD (ArgDesc , float) ; 

}; 

//  Constructor  to  convert  a  constant 
//  or  a  value  for  the  independent 
//  variable  to  a  FloatD 
FloatD: : FloatD (ArgDesc  a,  float  v)  { 
switch  (a)  { 
case  CONST: 
val'  =  0.0; 
val  =  v; 
break; 
case  VAR: 
val'  =1.0; 
val  =  v; 
break; 

} 

} 

FloatD  operator+ (FloatD  a,  FloatD  b){ 

FloatD  ans ; 

ans.val'  =  a.  val'  +  b.val'; 
ans. val  =  a. val  +  b.val; 
return  ans ; 

} 

FloatD  operator* (FloatD  a,  FloatD  b){ 

FloatD  ans; 

ans.val'  =  a. val  *  b.val'  +  a. val'  *  b.val; 
ans.val  =  a. val  *  b.val; 
return  ans ; 

} 


Figure  1.  A  differentiation-arithmetic  class. 
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to  the  initialization  of  ans  in  Prod),  the  program  now  carries  around 
derivative  values  (in  the  valr  field)  in  addition  to  performing  all  of  the 
work  performed  by  the  original  program.  Because  of  the  C++  overload- 
resolution  mechanism,  the  fi  procedures  invoked  in  the  fourth  line  of 
the  transformed  version  of  Prod  are  the  transform, ed  versions  of  the  f  i 
(i.e.,  the  fi  of  type  FloatD  — >  FloatD). 

The  value  of  Prod’s  derivative  at  v  is  obtained  by  calling  Prod'(v). 

□ 


In  a  differentiation  arithmetic,  each  procedure  in  the  user’s  program, 
such  as  Prod  and  the  fi  in  Example  2.3,  can  be  viewed  as  a  box  that 
maps  two  inputs  to  two  outputs,  as  depicted  below: 


V 


V 


w 


F 


I 

Computational 

Differentation 


i 


Differentiating 
version  of  F 

F(v) 


F  (v) 

F'  (v) *w 


In  particular,  in  each  differentiating  version  of  a  user-defined  or  library 
procedure  F,  the  lower- right-hand  output  produces  the  value  F'  (v)  *w. 

An  input  value  v  for  the  formal  parameter  is  treated  as  a  pair 
(v ,  1 . 0) .  Boxes  like  the  one  shown  above  “snap  together” :  when  F 
is  composed  with  G  (and  the  input  is  v),  the  output  value  on  the 
lower-right-hand  side  is  F;  (Gfvl^G'Cv),  which  agrees  with  the  usual 
expression  for  the  chain  rule  for  the  first-derivative  operator: 


F(G(v) ) 


F  (G  (v)  ) 

F'  (G  (v)  )  *G'  (v) 


The  computational-differentiation  technique  summarized  above  is 
what  is  known  as  forward-mode  differentiation.  A  different  computational- 
differentiation  technique,  reverse  mode  (Linnainmaa,  1976;  Speelpen- 
ning,  1980;  Iri,  1984;  Griewank,  1989;  Griewank,  1991),  is  generally 
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preferable  when  the  number  of  independent  variables  is  much  greater 
than  the  number  of  dependent  variables.  However,  although  it  is  pos¬ 
sible  to  develop  a  reverse-mode  version  of  computational  divided  dif¬ 
ferencing,  it  does  not  appear  to  offer  the  same  potential  savings  in 
operations  performed  that  reverse  mode  achieves  for  computational  dif¬ 
ferentiation.  Because  the  remainder  of  the  paper  concerns  the  general¬ 
ization  of  forward-mode  computational  differentiation  to  forward-mode 
computational  divided  differencing,  reverse  mode  is  not  summarized 
here. 

The  availability  of  overloading  makes  it  possible  to  implement  (forward¬ 
mode)  computational  differentiation  conveniently,  by  packaging  it  as  a 
differentiation-arithmetic  class,  as  illustrated  above.  The  alternative  to 
the  use  of  overloading  is  to  build  a  special-purpose  preprocessor  to 
carry  out  the  statement-doubling  transformation  that  was  illustrated 
in  Examples  2.1  and  2.2.  Examples  of  systems  that  use  the  latter  ap¬ 
proach  include  ADIFOR  (Bischof  et  al.,  1992;  Bischof  et  al.,  1996)  and 
ADIC  (Bischof  et  al.,  1997). 

2.1.  Limitations  of  Computational  Differentiation 

This  section  discusses  certain  limitations  of  the  computational-differentiation 
transformation.  First,  it  is  worthwhile  mentioning  that  the  presence  of 
aliasing  (e.g.,  due  to  pointers  or  reference  parameters)  is  not  a  limi¬ 
tation  of  computational  differentiation  (nor  of  computational  divided 
differencing):  The  transformations  presented  above  (as  well  as  later  in 
Sects.  3,  5,  6,  and  7)  work  properly  in  the  presence  of  aliasing  (and  are 
said  to  be  alias-safe  (Griewank,  2000). 

One  limitation  of  computational  differentiation  comes  from  the  fact 
that  a  program  F'(x)  that  results  from  computational  differentiation 
can  perform  additions  and  subtractions  for  which  there  are  no  ana¬ 
logues  in  the  original  program  F(x).  For  instance,  in  program  Prod7,  an 
addition  is  performed  in  the  statement 

ans'  =  ans'  *  fi(x)  +  ans  *  f'^x); 
whereas  no  addition  is  performed  in  the  statement 

ans  =  ans  *  fi(x); 

Consequently,  the  result  of  evaluating  F'(x)  can  be  degraded  by  round¬ 
off  error  even  when  F(x)  is  computed  accurately.  However,  the  accuracy 
of  the  result  from  evaluating  F'(x)  can  be  verified  by  performing  the 
same  computation  in  interval  arithmetic  (Rail,  1983;  Rail,  1992). 

Another  problem  that  arises  is  that  the  manner  in  which  a  func¬ 
tion  is  programmed  influences  whether  the  results  obtained  from  the 
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derivative  program  are  correct.  For  instance,  for  programs  that  use  a 
conditional  expression  or  conditional  statement  in  which  the  condition 
depends  on  the  independent  variable — i.e.,  where  the  function  is  defined 
in  a  piecewise  manner — the  derivative  program  may  not  produce  the 
correct  answer. 

EXAMPLE  2.4.  (Fischer,  1992).  Suppose  that  the  function  F(x )  =  x2 
is  programmed  using  a  conditional  statement,  as  shown  below  on  the 
left: 


float  F (float  x) { 
float  ans ; 
if (x  ==  1.0) { 
ans  =  1.0; 

} 

else{ 

ans  =  x*x; 

} 

return  ans ; 

} 


float  F'  (float  x) { 
float  ans'; 
float  ans; 
if (x  ==  1.0) { 
ans'  =  0.0; 
ans  =  1.0; 

} 

else{ 

ans'  =  x+x; 
ans  =  x*x; 

} 

return  ans' ; 

} 


Computational  differentiation  would  produce  the  program  shown  above 
on  the  right.  With  this  program,  F; (1 . 0)  returns  0. 0,  rather  than  the 
correct  value  of  2.0  (i.e.,  correct  with  respect  to  the  meaning  of  the 
program  as  the  mathematical  function  F(x)  —  x 2).  □ 

The  phenomenon  illustrated  in  Example  2.4  has  been  called  the 
branch  problem  or  the  if  problem  for  computational  differentiation.  A 
more  important  example  of  the  branch  problem  occurs  in  Gaussian 
elimination  code,  where  pivoting  introduces  branches  into  the  pro¬ 
gram  (Fischer,  1992;  Beck  and  Fischer,  1994;  Griewank,  2000).  Some 
additional  problems  that  can  arise  with  computational  differentiation 
are  identified  in  (Fischer,  1992).  A  number  of  different  approaches  to 
these  problems  have  been  discussed  in  the  literature  (Fischer,  1992; 
Beck  and  Fischer,  1994;  Shamseddine  and  Berz,  1996;  Kearfott,  1996; 
Griewank,  2000). 

Computational  divided  differencing  has  some  similar  (or  even  worse) 
problems.  All  of  these  issues  are  outside  the  scope  of  the  present  paper; 
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the  problem  of  finding  appropriate  ways  to  generalize  the  aforemen¬ 
tioned  techniques  to  handle  the  problems  that  arise  with  computational 
divided  differencing  is  left  for  future  work. 


3.  Computational  Divided  Differencing 


In  this  paper,  we  exploit  the  principle  on  which  computational  differ¬ 
entiation  is  based — namely,  that  it  is  possible  to  differentiate  entire 
programs ,  not  just  expressions — do  develop  a  variety  of  new  computa¬ 
tional  divided- differencing  transformations.  We  develop  several  trans¬ 
formations  that  can  be  applied  to  numerical  programs.  One  of  these 
corresponds  to  the  first-divided-difference  operator ,  denoted  by  -[xq.  x  1  ] 
and  defined  as  follows: 


F[x 0,si]  d=f 


F(xq)-F(xi) 

XQ-Xl 


if  xq  /  xi 
evaluated  at  z  —  xq  if  xq  —  x\ 


(3) 


As  with  the  differentiation  operator,  the  problem  that  we  face  is  that 
because  division  by  a  small  value  and  subtraction  are  both  operations 
that  amplify  accumulated  round-off  error,  direct  use  of  Eqn.  (3)  may 
lead  to  highly  inaccurate  results.  In  contrast,  given  a  program  that  com¬ 
putes  a  numerical  function  F(x ),  our  technique  for  computational  first 
divided  differencing  creates  a  related  program  that  computes  F[x o,aq], 
but  without  directly  evaluating  the  right-hand  side  of  Eqn.  (3). 

As  we  show  below,  the  program  transformation  that  achieves  this 
goal  is  quite  similar  to  the  transformation  used  in  computational-differentiation 
tools.  The  transformed  program  sidesteps  the  explicit  subtraction  and 
division  operations  that  appear  in  Eqn.  (3),  while  producing  answers 
that  are  equivalent  (from  the  standpoint  of  evaluation  in  real  arith¬ 
metic)  .  The  program  that  results  thereby  avoids  many  operations  that 
could  potentially  amplify  round-off  error,  and  hence  retains  accuracy 
when  evaluated  in  floating-point  arithmetic. 

To  understand  the  basis  of  the  idea,  consider  the  case  in  which 
F(x)  —  x2  and  xq  /  xp. 


F[x  0,xi] 


F{x q)  -  F(x i) 

Xq  -  Xi 


J2  ‘2 
x0  xl 

Xq  -  Xi 


—  Xq  +  X\. 


(4) 


That  is,  the  first  divided  difference  can  be  obtained  by  evaluating  xq  + 
x  [ .  In  general,  for  monomials  we  have: 


F(x) 

c 

X 

X2 

X3 

F[x0,x  i] 

0 

1 

Xq  +  Xi 

X%  +  XqXi  +  xf 
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Turning  to  programs,  suppose  that  we  are  given  the  following  pro¬ 
gram  for  squaring  a  number: 


float  Square (float  x){ 
return  x  *  x; 

} 


The  above  discussion  implies  that  to  compute  the  first  divided-difference 
of  Square,  we  have  our  choice  between  the  programs  Square_lDD  jiaive 
and  Square_lDD: 


float  Square_lDD_naive (float  xO, float  xl){ 
return  (Square (xO)  -  Square (xl) )/ (xO  -  xl) ; 

} _ 

float  Square_lDD (float  xO , float  xl){ 
return  xO  +  xl; 

} 


However,  the  round-off-error  characteristics  of  Square_lDD  are  much 
better  than  those  of  Square_lDDjiaive. 

The  basis  for  creating  expressions  and  programs  that  compute  ac¬ 
curate  divided  differences  is  to  be  found  in  the  basic  properties  of 
the  first-divided-difference  operator  (Krawczyk  and  Neumaier,  1985), 
which  closely  resemble  those  of  the  first-derivative  operator,  as  shown 
in  Table  I: 

The  program  transformation  for  performing  computational  divided 
differencing  can  be  explained  by  means  of  an  example. 

EXAMPLE  3.1.  Suppose  that  we  have  a  C++  class  Poly  that  repre¬ 
sents  polynomials,  and  a  member  function  Poly :  :  Eval  that  evaluates  a 
polynomial  via  Horner’s  rule;  i.e.,  it  accumulates  the  answer  by  repeat¬ 
edly  multiplying  by  x  and  adding  in  the  current  coefficient,  iterating 
down  from  the  high-order  coefficient:7 

7  As  a  running  example,  the  paper  uses  the  evaluation  of  a  polynomial  in  x  via 
Horner’s  rule.  It  is  well-known  that  Horner’s  rule  can  return  inaccurate  results  when 
it  is  used  to  evaluate  a  polynomial  in  floating-point  arithmetic  (Hammer  et  al., 
1995,  pp.  65-67).  Our  examples  are  not  meant  to  illustrate  a  way  to  circumvent 
this  shortcoming.  Horner’s  rule  is  used  as  the  basis  of  our  examples  solely  because 
of  one  virtue:  it  is  a  very  simple  procedure,  which  allows  the  various  different 
computational-divided-differencing  transformations  to  be  illustrated  in  a  succinct 
manner. 
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Table  I.  Basic  properties  of  the  first-derivative  and  first-divided-difference  operators. 


First  Derivative 

First  Divided  Difference  (Krawczyk  and  Neumaier,  1985) 

c'  =  0.0 

c[x  o,  xi\  =  0.0 

x1  =  1.0 

x\xq,xi]  =  1.0 

{c  +  F)'{x)  =F'(x) 

(c  +  F)[x  0,xi\  =  F[x  0,xi] 

(c  *  F)'(x)  =  c*  F'(x ) 

(c*F)[x0,x  i]  =  c*F[x0,x  i] 

(F  +  G)'{x)  =F'{x)+G'{x) 

(F  +  G)[x0,x  i]  =  F[x0,x  i]  +  G[x0,xi] 

(F  *  G)'(x)  —  F'W*GW 

(  )U  +  F(x)*G'(x) 

/J?  ,  F[x0,xi\*G(x1) 

(  G)[x0,xi]  +  f{xo)*g[xQjXi] 

_  F'(x)*G(x)-F(x)*G'(x) 

/F\ r™  ™  1  _  F[xo,xi]*G(xi)—F(xi)*G[xo,xi] 

W  -  aW 

VGJDO’^IJ  “  G(x0)*G(a:i) 

//  Evaluation  via  Horner’s  rule 
float  Poly: :Eval (float  x){ 
float  ans  =  0.0; 

for  (int  i  =  degree;  i  >=  0 ;  i — ){ 
ans  =  ans  *  x  +  coeff[i]; 

}  return  ans ; 

} 


A  new  member  function,  Poly:  :Eval_lDD,  to  compute  the  first  divided 
difference  can  be  created  by  transforming  Poly :  :  Eval  as  shown  below: 


class  Poly  { 

float  Poly: :Eval_lDD (float  xO, float  xl){ 

public : 

float  ans_lDD  =  0.0; 

float  Eval (float) ; 

float  ans  =  0.0; 

float  Eval_lDD (float , float) ; 

for  (int  i  =  degree;  i  >=  0 ;  i — ){ 

private : 

ans_lDD  =  ans_lDD  *  xl  +  ans; 

int  degree; 

ans  =  ans  *  xO  +  coeff  [i]  ; 

//  Array  coeff [0. .degree] 

} 

float  *coef f ; 

return  ans_lDD ; 

}; 

} 

□ 


class  Poly  { 
public : 

float  Eval(float); 
private : 
int  degree ; 

//  Array  coeff [0. .degree] 
float  *coef f ; 

}; 
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The  transformation  used  to  obtain  Eval_lDD  from  (the  text  of) 
Eval  is  similar  to  the  computational-differentiation  transformation  that 
would  be  used  to  create  a  derivative-computing  program  Eval7  (Eval7 
appears  in  Fig.  2): 

—  Eval_lDD  is  supplied  with  an  additional  formal  parameter  (and  the 
two  parameters  are  renamed  xO  and  xl). 

—  For  each  local  variable  v  of  type  float  used  in  Eval,  an  additional 
float  variable  v_lDD  is  introduced  in  Eval_lDD. 

—  Each  statement  of  the  form  “v  =  exp ;  ”  in  Eval  is  transformed  into 
“v_lDD  =  exp[xO,xl];  v  =  exp0;v,  where  exp [xO,xl]  is  the  ex¬ 
pression  for  the  divided  difference  of  exp,  and  exp0  is  exp  with  xO 
substituted  for  all  occurrences  of  x. 

—  Each  statement  of  the  form  “return  v”  in  Eval  is  transformed 
into  “return  v_lDD”. 

One  caveat  concerning  the  transformation  presented  above  should  be 
noted:  the  transformation  applies  only  to  procedures  that  have  a  certain 
special  syntactic  structure — namely,  the  only  multiplication  operations 
that  depend  on  the  independent  variable  x  must  be  multiplications  on 
the  right  by  x.  Procedure  Eval  is  an  example  of  a  procedure  that  has 
this  property. 

A  different,  but  similar,  transformation  can  be  used  if  all  of  the 
multiplication  operations  that  depend  on  the  independent  variable  x 
are  multiplications  on  the  left  by  x.  (This  point  is  discussed  further  in 
Example  6.1.)  It  is  also  possible  to  give  a  fully  general  first-divided- 
difference  transformation;  however,  this  transformation  can  be  viewed 
as  a  special  case  of  the  material  presented  in  Sect.  5.  Consequently,  we 
will  not  pause  to  present  the  general  first-divided-difference  transfor¬ 
mation  here. 

Alternatively,  as  with  computational  differentiation,  for  languages 
that  support  operator  overloading,  computational  divided  differencing 
can  be  carried  out  with  the  aid  of  a  new  class,  say  Float  1DD,  for  which 
the  arithmetic  operators  are  appropriately  redefined.  (We  will  call  such 
a  class  a  divided-difference  arithmetic.)  Computational  divided  differ¬ 
encing  is  then  carried  out  by  making  appropriate  changes  to  the  types  of 
each  procedure’s  formal  parameters,  local  variables,  and  return  value. 

Again,  definitions  of  first-divided-difference  arithmetic  classes — both 
for  the  case  of  general  first  divided  differences,  as  well  as  for  the  special 
case  that  covers  programs  like  Eval — can  be  viewed  as  special  cases 
of  the  divided-difference  arithmetic  classes  FloatDD  and  FloatDDRl 
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discussed  in  Sects.  5  and  6,  respectively.  For  this  reason,  we  post¬ 
pone  giving  a  concrete  example  of  a  divided-difference  arithmetic  until 
Sects.  5  and  6. 


4.  Computational  Divided  Differencing  as  a  Generalization 
of  Computational  Differentiation 


In  this  section,  we  explain  in  what  sense  computational  divided  differ¬ 
encing  can  be  said  to  generalize  computational  differentiation.  First, 
observe  that  over  real  numbers,  we  have 


lim  Fix o,a?i] 

Xl—tXQ 


lim 

Xl—tXO 


F{x q)  -  F(x i) 
xq  -  xl 


F'(x  o). 


(5) 


However,  although  Eqn.  (5)  holds  over  reals,  it  does  not  hold  over 
floating-point  numbers:  as  x*  approaches  x0,  because  of  accumulated 
round-off  error,  the  quantity  does  not ,  in  general,  approach 

F'(x0).  (Note  the  use  of  Courier  Font  here;  this  is  a  statement  about 
quantities  computed  by  programs.)  This  is  why  derivatives  cannot  be 
computed  accurately  by  procedure  F'maive  (see  box  (2)). 

In  contrast,  for  the  programs  F[x0,xi]  and  F;(xo),  we  do  have  the 
property  that 

1™  F[x0,xi]  =  F'(x0).  (6) 

Xi  — ^Xo 

More  precisely,  the  property  that  holds  is  that  we  have  equality  when 
X!  equals  x0: 

F[x0,x0]  =  F'(xq).  (7) 


EXAMPLE  4.1.  To  illustrate  Eqn.  (7),  consider  applying  the  two  trans¬ 
formations  to  member  function  Poly:  :Eval  of  Example  3.1;  the  result 
is  shown  in  Fig.  2.  When  formal  parameters  xO,  xl,  and  x  all  have  the 
same  value — say  v — then  exactly  the  same  operations  are  performed  by 
Eval_lDD(v,v)  and  Eval7 (v).  □ 


Because  computations  are  carried  out  over  floating-point  numbers, 
the  programs  F[x0,  xi]  and  F'(x0)  are  only  approximations  to  the  func¬ 
tions  that  we  actually  desire.  That  is,  F[x0,xi]  approximates  the  func¬ 
tion  F[x o,  aq],  and  Fr(x0)  approximates  F'(x o).  The  relationships  among 
these  functions  and  programs  are  depicted  below: 
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class  Poly  { 
public : 

float  Eval (float); 
float  Eval_lDD (float , float) ; 
float  Eval' (float) ; 
private : 
int  degree ; 

//  Array  coeff [0. .degree] 
float  *coef f ; 

}; 

float  Poly: :Eval_lDD (float  xO , float  xl){ 
float  ans_lDD  =  0.0; 
float  ans  =  0.0; 

for  (int  i  =  degree;  i  >=  0 ;  i — ){ 
ans_lDD  =  ans_lDD  *  xl  +  ans; 
ans  =  ans  *  xO  +  coeff [i] ; 

} 

return  ans_lDD; 

} 

float  Poly:  : Eval' (float  x){ 
float  ans'  =  0.0; 
float  ans  =  0.0; 

for  (int  i  =  degree;  i  >=  0 ;  i — ){ 
ans'  =  ans'  *  x  +  ans ; 
ans  =  ans  *  x  +  coeff  [i] ; 

} 

return  ans'; 

} 


Figure  2.  The  result  of  applying  the  computational-differentiation  and 
first-divided-difference  transformations  to  member  function  Poly:  :Eval  of 
Example  3.1. 
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Programs  and 
Program  Transformations 


Standard 

Divided  Differencing 

F  (x0)  -F  (xi)  p 

Xq-Xx  xi^Xo*"^ 


As  the  discussion  above  has  noted,  the  interesting  feature  of  this  dia¬ 
gram  is  the  relationship  between  F[x0,xi]  and  F'(x0),  on  the  side  of 
the  diagram  labeled  “Programs  and  Program  Transformations”.  In 
particular,  as  approaches  x0,  F[x0. xx]  approaches  F'(x0).  Conse¬ 
quently,  a  program  produced  by  a  system  for  computational  divided 
differencing  can  be  used  to  compute  values  of  derivatives  (in  addition 
to  divided  differences)  by  feeding  it  duplicate  actual  parameters  (e.g., 
Eval_lDD(v,v)).  In  contrast,  a  program  created  by  means  of  compu¬ 
tational  differentiation  can  only  produce  derivatives  (and  not  divided 
differences).  In  this  sense,  computational  divided  differencing  can  be 
said  to  generalize  computational  differentiation. 

Computational  divided  differencing  suffers  from  one  of  the  same 
problems  that  arises  with  computational  differentiation — namely  that 
the  program  F[x0,  xjJ  that  results  from  the  transformation  can  perform 
additions  and  subtractions  that  have  no  analogues  in  the  original  pro¬ 
gram  F(x)  (see  the  discussion  in  Sect.  2.1).  Consequently,  the  result  of 
evaluating  F[x0,  xx]  can  be  degraded  by  round-off  error  even  when  F(x) 
is  computed  accurately.  However,  computational  divided  differencing  is 
no  worse  in  this  regard  than  computational  differentiation.  Moreover, 
because  of  the  fact  that  FfxojXi]  converges  to  F'(x0)  as  xt  approaches 
x0,  if  F'(x0)  returns  a  result  of  sufficient  accuracy,  then  F[x0,  xx]  will 
return  a  result  of  sufficient  accuracy  when  |x0  —  x^  is  small.  (In  future 
work,  we  plan  to  investigate  the  use  of  interval  arithmetic  for  providing 
interval  bounds  for  computational  divided  differencing.) 


5.  Higher-Order  Computational  Divided  Differencing 

In  this  section,  we  show  that  the  idea  that  was  introduced  in  Sect.  3 
can  be  generalized  to  define  a  transformation  for  higher-order  computa¬ 
tional  divided  differencing .  To  do  so,  we  will  define  a  divided-difference 
arithmetic  that  manipulates  divided-difference  tables. 
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Higher-order  divided  differences  are  divided  differences  of  divided 
differences,  defined  recursively  as  follows: 


F[x 0,xi, . . 


F[Xi }  ¥  F{Xi) 


1  def 
«£n— — 


F[xQ,Xl,...,Xn  —  l\  —  F[xi,...,Xn  —  \,Xn\ 

XQ  —  Xn 

^F[z,x 


(8) 

if  x0  /  xn  .  . 
if  xq  =  xrl 


In  our  context,  a  divided-difference  table  for  a  function  F  is  an  upper- 
triangular  matrix  whose  entries  are  divided  differences  of  different  or¬ 
ders,  as  indicated  below:8 


/  F(x 0)  F[x0lxi]  F[xq,x i,x2] 
0  F(x  i)  F[xi,x2] 

0  0  F(x  2) 

V  0  0  0 


F[x o,xi,x2,x3\  \ 
F[x  1,2:2, 2:3] 
^2,2:3] 

F(x 3)  J 


Higher-order  divided  differences  have  numerous  applications  in  inter¬ 
polation  and  approximation  of  functions  (Conte  and  de  Boor,  1972). 

We  occasionally  use  [xij]  as  an  abbreviation  for  [xt . ...,  Xj\.  However, 
the  reader  should  note  that 


^[2:0,2]  =  F[x0lxllx2] 

F\x q1£i]  -  F\xi,x2\ 

Xq  -  X2 

which  is  not  the  same  as  F[x 0,2:2]: 


F[x0l  x2\ 


F(x0)  -  F(x2) 

Xq  -  X2 


We  use  to  denote  the  operator  that  yields  the  divided- 

difference  table  for  a  function  with  respect  to  points  xo, . . .  ,xn.  (We 
use  -x  if  the  points  xo, ...  ,xn  are  clear  from  the  context.) 

A  method  for  creating  accurate  divided-difference  tables  for  rational 
expressions  is  found  in  Opitz  (Opitz,  1964).  This  method  is  based  on 
the  properties  of  given  in  the  right-hand  column  of  Table  II: 

A  few  items  in  Table  II  require  explanation: 


—  The  symbol  I  denotes  the  identity  matrix. 

8  Other  arrangements  of  F’s  higher-order  divided  differences  into  matrix  form 
are  possible.  For  example,  one  could  have  a  lower  triangular  matrix  with  the  F(xi) 
running  down  the  first  column.  However,  the  use  of  the  arrangement  shown  above 
is  key  to  being  able  to  use  simple  notation — i.e. ,  ordinary  matrix  operations — to 
describe  our  methods  (Opitz,  1964). 
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Table  II.  Basic  properties  of  two  divided-difference  operators. 


|  First  Divided  Difference  (Krawczyk  and  Neumaier,  1985) 

|  Divided-Difference  Table  (Opitz,  1964)  j 

|  c[x o,  xi]  =  0.0 

|  cN[a:o, ••■,#«]  —  c  *  J  j 

|  x[xo,  x\]  =  1.0 

*a['cb . *-  /1>C.  ,r„:  | 

(c  +  F)[xo,a:i]  =  F[x  o,xi] 

(c+  ■■•>**]  =C*I  +  . 

(c  *  F)[xo,xi]  =  c*  .F[xo,xi] 

(c*  /*‘)Tcc . =  c  * 

(F  +  G)[x  o,xi]  =  F[x  o,xi]  +  G[xq,xi\ 

(F  4-  (•;)s-'re . "“I  =  ■>*“■! 

/~i\ r  i  F[xo,xi]  *  G(xi) 

+  F(x0)  *  G[x0,xi] 

(F  *  g)^0’'"’2^  =  . *»)  *  ^**>1 

{  F\  r_  _  l  _ 

f  F\  ^[*0  »■■■»***] 

{G)lX0,Xl\-  G^oJ.G^!) 

(g)  ~ 

—  The  symbol  ■A-[Xo,...,xn\  denotes  the  matrix 


j  x  o  1  0  •••  0  \ 

0  xi  1  •••  0 


0  •••  xn-i  1 

V  0  .  0  xn  ) 


(10) 


—  In  the  entry  for  ( F  *  the  multiplication  operation  in 

F  *[xo;---,xn\  *  is  matrix  multiplication. 


In  the  entry  for  ^  \  the  division  operation  in 

is  matrix  division  (i.e.,  ^  —  P  *  Q _1). 


The  two  columns  of  Table  II  can  be  read  as  recursive  definitions  for 
the  operations  •  [xq ,  x \ ]  and  respectively.  These  have  straight¬ 

forward  implementations  as  recursive  programs  that  walk  over  an  ex¬ 
pression  tree. 

The  reader  should  be  aware  that  the  -[xq^xi]  operation  defined  in 
the  first  column  of  Table  II  creates  an  expression  that  computes  only 
the  first  divided  difference  of  the  original  expression  e(x).  whereas  the 
operation  defined  in  the  second  column  of  Table  II  creates  a  (matrix) 
expression  that  computes  all  of  the  first,  second,  . . .,  nth  divided  dif¬ 
ferences  with  respect  to  the  points  xo,  x\,  ...,  xn,  as  well  as  n  +  1 
values  of  e{x).  In  particular,  in  the  matrix  that  results  from  evaluating 
the  transformed  expression,  the  values  of  e{x)  evaluated  at  the  points 
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xq,  x\t  . ..,  xn  are  found  on  the  diagonal.  For  instance,  in  the  case 
of  an  expression  that  multiplies  two  subexpressions  F(x)  and  G(x), 
the  elements  on  the  diagonal  are  F(x o)  *  G(x o),  F(x i)  *  G(x i), 

F{xn )  *  G {xn). 

It  is  easy  to  verify  (by  means  of  induction)  that  the  first  and  second 
columns  of  Table  II  are  consistent  with  each  other:  in  each  case,  the 
quantity  e[xQ,X\\  represents  the  (0,1)  entry  of  the  matrix  e  'I10""*!. 

The  second  column  of  Table  II  has  an  even  more  straightforward 
interpretation: 

OBSERVATION  5.1.  [Reinterpretation  Principle].  The  divided-difference 
table  of  an  arithmetic  expression  e{x)  with  respect  to  the  n  +  1  points 
xo, . . . ,  xn  can  be  obtained  by  reinterpreting  e(x)  as  a  matrix  expression, 
where  the  matrix  A[a.0j ...;Xn]  is  used  at  each  occurrence  of  the  variable  x, 
and  c*  I  is  used  at  each  occurrence  of  a  constant  c.  □ 

That  is,  the  expression  tree  for  e(x)  is  unchanged — except  at  its 
leaves,  where  is  used  in  place  of  x,  and  c*I  is  used  in  place  of 

c — but  the  operators  at  all  internal  nodes  are  reinterpreted  as  denoting 
matrix  operations.  This  observation  is  due  to  Opitz  (Opitz,  1964). 

With  only  a  slight  abuse  of  notation,  we  can  express  this  as 

=  4W 

Using  this  notation,  we  can  show  that  the  chain  rule  for  the  divided- 
difference  operator  pas  the  following  particularly  simple  form: 

(F  o  Gp« . *•!  =  (F  o  (11) 

=  F(G(^[w . 

Opitz’s  idea  can  be  extended  to  the  creation  of  accurate  divided- 
difference  tables  for  functions  defined  by  programs  by  overloading  the 
arithmetic  operators  used  in  the  program  to  be  matrix  operators — i.e., 
by  defining  a  divided-difference  arithmetic  that  manipulates  divided- 
difference  tables: 

OBSERVATION  5.2.  [Computational  Divided-Differencing  Principle]. 
Rather  than  computing  a  divided- difference  table  with  respect  to  the 
points  xq,  x\,  . . .,  xn  by  invoking  the  program  n  +  1  times  and  then 
applying  Eqns.  (8)  and  (9),  we  may  instead  evaluate  the  program,  (once) 
using  a  divided-difference  arithmetic  that  overloads  arithmetic  opera¬ 
tions  as  matrix  operations,  substituting  A[a.0j ..^Xn\  for  eac/i  occurrence 
of  the  formal  parameter  x,  and  c*  I  for  each  occurrence  of  a  constant 
c.  □ 
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The  reader  should  understand  that  the  single  invocation  of  the 
program  using  the  divided-difference  arithmetic  will  actually  be  more 
expensive  than  the  n  +  1  ordinary  invocations  of  the  program.  The 
advantage  of  using  divided-difference  arithmetic  is  not  that  execution 
is  speeded  up  because  the  program  is  only  invoked  once  (in  fact,  ex¬ 
ecution  is  slower);  the  advantage  is  that  the  result  computed  using 
divided-difference  arithmetic  is  much  more  accurate. 

Because  higher-order  divided  differences  are  defined  recursively  in 
terms  of  divided  differences  of  lower  order  (cf.  Eqns.  (8)  and  (9)),  it 
would  be  possible  to  define  an  algorithm  for  higher-order  computational- 
divided-differencing  using  repeated  applications  of  lower-order  computational- 
divided-differencing  transformations.  However,  with  each  application 
of  the  transformation  for  computational  first  divided  differencing,  the 
program  that  results  performs  (roughly)  three  times  the  number  of 
operations  that  are  performed  by  the  program  the  transformation  starts 
with.  Consequently,  this  approach  has  a  significant  drawback:  the  final 
program  that  would  be  created  for  computing  kth  divided  differences 
could  be  0(3k)  times  slower  than  the  original  program.  In  contrast,  the 
slow-down  factor  with  the  approach  based  on  Observation  5.2  is  0(k 3). 

We  now  sketch  how  a  version  of  higher-order  computational  divided 
differencing  based  on  Observation  5.2  can  be  implemented  in  C++. 

Below,  we  present  highlights  of  a  divided-difference  arithmetic  class, 
named  FloatDD.  We  actually  make  use  of  two  classes:  (i)  class  FloatDD, 
the  divided-difference  arithmetic  proper,  and  (ii)  class  FloatV,  vectors 
of  xi  values.  These  classes  are  defined  as  follows: 


class  FloatDD  { 
public : 

int  numPts;  //  Size  is  numPts-by-numPts 

float  **divDif f Table ;  //  Two-dimensional  upper-triangular  array 
FloatDD (ArgDesc  ad,  int  N,  float  v) ;  //  N-by-N  constant  or  variable:  value  v 
FloatDD (const  FloatV  &)  ;  //  Construct  a  FloatDD  from  a  FloatV 
FloatDD (int  N)  ;  //  Construct  a  zero-valued  FloatDD  of  size  N-by-N 
FloatDD&  operator+  (const  FloatDD  &)  const;  //  binary  addition 

FloatDD&  operator-  (const  FloatDD  &)  const;  //  binary  subtraction 

FloatDD&  operator*  (const  FloatDD  &)  const;  //  binary  multiplication 

FloatDD&  operator/  (const  FloatDD  &)  const;  //  binary  division 

}; 
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class  FloatV  { 
public : 
int  numPts; 

float  *val;  //  An  array  of  values:  val[0] . .val [numPts -1] 

FloatV (int  N,  ...);  //  N  points 

FloatV(float  start,  int  N,  float  incr) ;  //  N  equally  spaced  points 

}; 


The  constructor  FloatDD (const  FloatV  &)  ;  plays  the  role  of  gener¬ 
ating  a  matrix  -A[XOj...jXn]  from  a  vector  [x$ ,...,xn\  of  values  for  the 
independent  variable.  It  is  defined  as  follows: 


//  Construct  a  FloatDD  from  a  FloatV 
FloatDD: : FloatDD (const  FloatV  &fv)  : 
numPts (f v . numPts) , 

divDif f Table (calloc_ut  (numPts)  )  //  allocate  upper-triangular  matrix  of  zeros 

{ 

for  (int  i  =  0;  i  <  numPts;  i++)  { 
divDif f Table  [i] [i]  =  fv.val[i]; 
if  (i  <  numPts-1)  divDif fTable  [i]  [i+1]  =  1.0; 

} 

} 


The  constructor  FloatDD (ArgDesc  ad,  int  N,  float  v)  ;  gener¬ 
ates  either  the  matrix  A [v v]  of  size  N-by-N  or  the  matrix  v  *  I  of  size 
N-by-N,  depending  on  the  value  of  parameter  ad.  It  is  defined  as  follows: 
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FloatDD : :FloatDD (ArgDesc  ad,  int  N,  float  v)  : 
numPts (N) , 

divDiff  Table  (calloc_ut  (numPts)  ) 

{ 

int  i ; 

switch  (ad)  { 
case  CONST: 

for  (i  =  0;  i  <  numPts;  i++)  { 
divDiff Table [i] [i]  =  v; 

} 

break; 
case  VAR: 

for  (i  =  0;  i  <  numPts;  i++)  { 
divDiff Table [i] [i]  =  v; 
if  (i  <  numPts  -  1) 

divDiff Table [i]  [i+1]  =  1.0; 

} 

break; 

} 

} _ 

The  multiplication  operator  of  class  FloatDD  simply  performs  matrix 
multiplication: 

FloatDD&  FloatDD :: operator*  (const  FloatDD  &fdd)  const{ 
assert (numPts  ==  f dd. numPts) ; 

FloatDD  *ans  =  new  FloatDD (numPts) ; 
for  (int  r  =  0;  r  <  numPts;  r++)  { 
for  (int  c  =  r;  c  <  numPts;  C++)  { 
float  temp  =  0.0; 
for  (int  k  =  r;  k  <=  c;  k++)  { 

temp  +=  divDif f Table  [r]  [k]  *  fdd. divDiff Table [k] [c] ; 

} 

ans->divDiff Table [r] [c]  =  temp; 

} 

} 

return  *ans; 

} 
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The  division  operator  of  class  FloatDD  is  implemented  using  back  sub¬ 
stitution.  That  is,  suppose  we  wish  to  find  the  value  of  A/B  (call  this 
value  X).  X  can  be  found  by  solving  the  system  X  *  B  =  A.  Because  the 
divided-difference  tables  A  and  B  are  both  upper-triangular  matrices, 
this  can  be  done  using  back  substitution,  as  follows: 


//  Use  back  substitution 

FloatDDft  FloatDD :: operator/  (const  FloatDD  &fdd)  const{ 
assert (numPts  ==  fdd.numPts); 
assert (NonZeroDiagonal(fdd)) ; 

FloatDD  *ans  =  new  FloatDD (numPts) ; 
for  (int  r  =  0;  r  <  numPts;  r++)  { 
for  (int  c  =  r;  c  <  numPts;  C++)  { 
float  temp  =  0.0; 
for  (int  k  =  r;  k  <  c;  k++)  { 

temp  +=  ans->divDiff Table [r] [k]  *  f dd.divDiff Table [k]  [c] ; 

} 

ans->divDiff Table [r] [c]  = 

(divDiff Table [r]  [c]  -  temp)  /  f dd.divDiff Table  [c]  [c] ; 

} 

} 

return  *ans; 

} 


EXAMPLE  5.3.  To  illustrate  these  definitions,  consider  again  the  func¬ 
tion  Poly :  :  Eval  that  evaluates  a  polynomial  via  Horner’s  rule.  Com¬ 
putational  divided  differencing  is  carried  out  by  changing  the  types  of 
Eval’s  formal  parameters,  local  variables,  and  return  value  from  float 
to  FloatDD: 
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//  Evaluation  via  Horner’s  rule 
float  Poly: :Eval (float  x) { 
float  ans  =  0.0; 

for  (int  i  =  degree;  i  >=  0;  i — ){ 
ans  =  ans  *  x  +  coef f  [i] ; 

} 

return  ans ; 

} 

//  Evaluation  via  Horner’s  rule 
FloatDD  Poly: :Eval (const  FloatDD  &x) { 

FloatDD  ans (x .numPts) ;  //  ans  =  0.0 
for  (int  i  =  degree;  i  >=  0;  i — ){ 
ans  =  ans  *  x  +  coef f  [i] ; 

} 

return  ans ; 

} _ 

The  transformed  procedure  can  be  used  to  generate  the  divided-difference 
table  for  the  polynomial 

P(x)  —  2.1  *  x3  —  1.4  *  x2  —  .6  *  x  +  1.1 

with  respect  to  the  (unevenly  spaced)  points  3.0,  3.01,  3.02,  3.05  by 
performing  the  following  operations: 

Poly  *P  =  new  Poly (4,2. 1 ,-l .4,-0. 6, 1. 1) ; 

FloatV  x(4, 3. 0,3. 01, 3. 02, 3. 05) ; 

FloatDD  A(x);  //  Corresponds  to  j4[Io . Js] 

FloatDD  fdd  =  P->Eval(A); 

We  now  present  some  empirical  results  that  illustrate  the  advantages 
of  the  computational-divided-differencing  method.  In  this  experiment, 
we  worked  with  the  polynomial 

P{x)  —  2.1  *  x3  —  1.4  *  x2  —  .6  *  x  +  1.1, 

and  performed  computations  using  single-precision  floating-point  arith¬ 
metic  on  a  Sun  SPARCstation  20/61  running  SunOS  5.6.  Programs 
were  compiled  with  the  egcs-2.91.66  version  of  g++  (egcs-1.1.2  release). 
The  experiment  compared  the  standard  method  for  generating  divided- 
difference  tables — namely,  the  recursive  definition  given  by  Eqn.  (8)  and 
the  first  line  of  Eqn.  (9) — against  the  overloaded  version  of  function 
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Poly:  :Eval  from  Example  5.3  (which  was  invoked  using  code  like  the 
fragment  shown  in  box  (5.3)). 

In  each  of  the  four  examples  shown  below,  the  values  used  for  the 
(unevenly  spaced)  points  xq.  x\ .  X2,  and  X3  are  shown  on  the  left. 
Note  how  the  standard  method  for  generating  divided-difference  tables 
degrades  as  the  points  move  closer  together.  (Places  where  the  results 
from  the  two  methods  differ  are  indicated  in  boldface.)  In  particular, 
because  P  is  a  cubic  polynomial  whose  high-order  coefficient  is  2.1, 
the  proper  value  of  P[x 0,  aq,  £2,  £3] — the  third  divided  difference  of 
P — is  2.1,  not  117,520!  (Compare  the  entries  that  appear  in  the  upper- 
right-hand  corners  of  the  fourth  pair  of  divided-difference  tables  shown 
below.) 


Computational  Divided  Differencing  Standard  Divided  Differencing 


Xo 

3.0 

/  43.4 

67.3  23.8  2.1 

\ 

( 

43.4  67.3  23.8 

2.09999 

\ 

XI 

4.0 

110.7  114.9  32.2 

110.7  114.9 

32.2 

X2 

5.0 

225.6  211.5 

225.6 

211.5 

XS 

7.0 

648.6 

\ 

648.6 

Xo 

3.0 

/  43.4 

47.8752  17.563  2.1 

\ 

/ 

43.4 

47.8749 

17.5858 

1.59073 

\ 

Xi 

3.01 

43.8787  48.2265  17.668 

43.8787 

48.2266 

17.6653 

X2 

3.02 

44.361  48.9332 

44.361 

48.9332 

X3 

3.05 

V 

45.829 

V 

45.829 

J 

Xo 

3.0 

(  43.4 

47.7175 

17.5063  2.1 

\ 

( 

43.4 

47.7177 

15.2886 

754.685 

\ 

XI 

3.001 

43.477 

47.7525  17.5168 

43.4477 

47.7483 

19.0621 

X2 

3.002 

43.4955  47.8226 

43.4955 

47.8245 

X3 

3.005 

V 

43.6389 

\ 

43.6389 

xo 

3.0  / 

43.4  47.7017 

17.5006  2.1 

\ 

( 

43.4 

47.6945 

3.62336 

117520 

\ 

Xl 

3.0001 

43.4048 

47.7052  17.5017 

43.4048 

47.6952 

62.379 

X2 

3.0002 

43.4095  47.7122 

43.4095 

47.7202 

X3 

3.0005  \ 

43.4238 

\ 

43.4238 

/ 

Finally,  when  we  set  all  of  the  input  values  to  3.0,  we  obtain 


xo  : 
xi  : 
X2  : 
X3  : 


Computational  Divided  Differencing  Standard  Divided  Differencing 


/  43.4 

47.7 

17.5 

21  \ 

/  43.4  NaN 

NaN 

NaN 

\ 

43.4 

47.7 

17.5 

43.4 

NaN 

NaN 

43.4 

47.7 

43.4 

NaN 

V 

43.4  / 

43.4 

/ 

With  the  standard  divided-differencing  method,  division  by  0  occurs 
and  yields  the  exceptional  value  NaN.  In  contrast,  computational  di¬ 
vided  differencing  produces  values  for  P’s  first,  second,  and  third  deriva¬ 
tives.  More  precisely,  each  kth  divided-difference  entry  in  the  computational- 
divided-differencing  table  equals 


1  dkP(x) 
k\  dxk 


£=3.0 


(12) 
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The  k  —  1  case  was  already  discussed  in  Sect.  4,  where  we  observed 
that  computational  first  divided  differencing  could  be  used  to  compute 
first  derivatives.  □ 


EXAMPLE  5.4.  (Kahan,  2000).  Suppose  that  we  wish  to  compute  the 
future  value  of  n  monthly  payments,  each  of  1  unit,  paid  at  the  end  of 
each  month  into  a  savings  account  that  compounds  interest  at  the  rate 
of  a  per  month  (where  a  is  a  small  positive  value  and  n  is  a  positive 
integer).  This  answers  the  question  “How  many  dollars  are  accumulated 
after  n  months,  when  you  deposit  $1  per  month  for  n  months,  into  a 
savings  account  that  pays  annual  interest  at  the  rate  of  (12  x  a  x  100)%, 
compounded  monthly?”  Future  value  can  be  computed  by  the  function 

FutureValue(a,n )  =f  —  ^ .  (13) 

a 


However,  this  can  also  be  written  as 


Future  Value(a7  n) 


((i+«r-in) 

(1  +  a)  —  1 


and  thus  is  equal  to  the  following  first-divided  difference  of  the  power 
function: 

FutureValue(a,n )  =  ( xn)[l  +  a,  1].  (14) 

The  latter  quantity  can  be  computed  to  nearly  full  accuracy  using 
computational  divided  differencing  by  computing  (x")n[1+q’1],  and  then 
extracting  the  (0, 1)  entry.  For  instance,  we  can  start  with  the  follow¬ 
ing  procedure  power,  which  computes  xn  via  repeated  squaring  and 
multiplication  by  x,  according  to  the  bits  of  argument  n: 


const  unsigned  int  num_bits  =  sizeof  (unsigned  int) *8; 

float  power (float  x,  unsigned  int  n)  { 
unsigned  int  mask  =  1  <<  (num_bits  -  1) ; 
float  ans  =  1.0; 

for  (unsigned  int  i  =  0;  i  <  num_bits;  i++)  { 
ans  =  ans  *  ans ; 
if  (mask  &  n) 
ans  =  ans  *  x; 
mask  >>=  1; 

} 

return  ans ; 

} 
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By  changing  the  types  of  power’s  formal  parameters,  local  variables, 
and  return  value,  we  create  a  version  that  computes  a  divided-difference 
table: 


FloatDD  power (FloatV  &x,  unsigned  int  n)  { 
unsigned  int  mask  =  1  <<  (num_bits  -  1) ; 
FloatDD  ans (CONST,  2,  1.0);  //  ans  =  1.0 
for  (unsigned  int  i  =  0;  i  <  num_bits;  i++)  { 
ans  =  ans  *  ans ; 
if  (mask  &  n) 
ans  =  ans  *  x ; 
mask  >>=  1; 

} 

return  ans ; 

} 


The  transformed  procedure  can  be  used  to  compute  the  desired  com¬ 
putation  to  nearly  full  accuracy  by  calling  the  procedure  FutureValue 
that  is  defined  below: 


float  FutureValue (float  alpha,  unsigned  int  n)  { 
float  w[2]  =  {  1+alpha,  1  }; 

FloatV  v(2 ,  w) ; 

return  power (v,n) . divDiff Table [0]  [1] ; 

} 


We  now  present  some  empirical  results  that  illustrate  the  advan¬ 
tages  of  this  approach.  In  this  experiment,  we  performed  computations 
using  single-precision  floating-point  arithmetic  on  a  Sony  VAIO  PCG- 
Z505JSK  (650  MHz  Intel  Pentium  III  processor)  running  Windows 
2000.  Programs  were  compiled  with  Microsoft  Visual  C++  6.0.  The 
table  given  below  shows  the  future  values  of  $1  deposits  made  for  360 
months,  as  computed  via  Eqn.  (13)  versus  procedure  FutureValue, 
for  a  variety  of  interest  rates.  (Places  where  the  results  from  the  two 
methods  differ  are  indicated  in  boldface.) 
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Interest  Rate 

Eqn.  (13) 

Procedure  FutureValue 

10% 

2260.5 

2260.49 

8% 

1490.36 

1490.36 

6% 

1004.52 

1004.52 

4% 

694.045 

694.048 

2% 

492.713 

492.722 

1% 

419.656 

419.632 

0.8% 

406.692 

406.718 

0.6% 

394.282 

394.323 

0.4% 

382.4 

382.422 

0.2% 

370.934 

370.986 

0.1% 

365.354 

365.438 

0.08% 

364.142 

364.34 

0.06% 

362.873 

363.247 

0.04% 

362.595 

362.165 

0.02% 

361.505 

361.08 

0.01% 

360.961 

360.54 

0.008% 

360.882 

360.432 

0.006% 

360.751 

360.324 

0.004% 

360.668 

360.216 

0.002% 

360.56 

360.108 

0.001% 

360.489 

360.054 

0.0008% 

386.238 

360.046 

0.0006% 

343.323 

360.031 

0.0004% 

386.238 

360.023 

0.0002% 

257.492 

360.008 

0.0001% 

514.984 

360.008 

0.00008% 

643.73 

360.008 

0.00006% 

0 

360 

0.00004% 

0 

360 

0.00002% 

0 

360 

□ 
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6.  A  Special  Case 


A  divided-difference  table  for  a  function  F  can  be  thought  of  as  a 
(redundant)  representation  of  an  interpolating  polynomial  for  F.  For 
instance,  if  you  have  a  divided-difference  table  T  (and  also  know  the 
appropriate  vector  of  values  xq,  x\.  . . .,  xn ) .  you  can  explicitly  construct 
the  Newton  form  of  the  interpolating  polynomial  for  F  according  to  the 
following  definition  (Conte  and  de  Boor,  1972,  pp.  197): 


n 

Pn{x )  =  ^F[x  o,.. 
i= 0 


i—  1 

,xi]  *  _  xj) 

3=0 


(15) 


Note  that  to  be  able  to  create  the  Newton  form  of  the  interpolating 
polynomial  for  F  via  Eqn.  (15),  only  the  first  row  of  divided-difference 
table  T  is  required  to  be  at  hand — i.e.,  the  values  F[x o,...,£j],  for 
0  <  i  <  n — together  with  the  values  of  x'o,  x'i,  . . .,  xn.  This  observa¬ 
tion  suggests  that  we  should  develop  an  alternative  divided-difference 
arithmetic  that  builds  up  and  manipulates  only  first  rows  of  divided- 
difference  tables.  We  call  this  divided-difference  arithmetic  FloatDDRl 
(for  Divided-Difference  Row  1).  The  motivation  for  this  approach  is 
that  FloatDDRl  operations  will  be  much  faster  than  FloatDD  ones,  be¬ 
cause  FloatDD  operations  must  manipulate  upper-triangular  matrices, 
whereas  FloatDDRl  operations  involve  only  simple  vectors. 

To  achieve  this,  we  define  class  FloatDDRl  as  follows: 
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class  FloatDDRl  { 

friend  FloatDDRl&  operator+  (const  FloatDDRl  &,  const  float); 
friend  FloatDDRl&  operator+  (const  FloatDDRl  &,  const  FloatV  &)  ; 
friend  FloatDDRl&  operator+  (const  float ,  const  FloatDDRl  &) ; 
friend  FloatDDRl&  operator+  (const  FloatV  &,  const  FloatDDRl  &)  ; 
friend  FloatDDRl&  operator-  (const  FloatDDRl  &,  const  float); 

friend  FloatDDRl&  operator-  (const  FloatDDRl  &,  const  FloatV  &) ; 

friend  FloatDDRl&  operator-  (const  float ,  const  FloatDDRl  &) ; 

friend  FloatDDRl&  operator-  (const  FloatV  &,  const  FloatDDRl  &) ; 
friend  FloatDDRl&  operator*  (const  FloatDDRl  &,  const  float); 

friend  FloatDDRl&  operator*  (const  FloatDDRl  &,  const  FloatV  &) ; 

friend  FloatDDRl&  operator*  (const  float ,  const  FloatDDRl  &) ; 

friend  FloatDDRl&  operator*  (const  FloatV  &,  const  FloatDDRl  &) ; 
friend  FloatDDRl&  operator/  (const  FloatDDRl  &,  const  float); 

friend  FloatDDRl&  operator/  (const  FloatDDRl  &,  const  FloatV  &) ; 

public : 
int  numPts; 

float  *divDif f Table ;  //  One-dimensional  array  of  divided  differences 
FloatDDRl (int  N)  //  Construct  a  zero-valued  FloatDDRl  of  length  N 
:  numPts (N) ,  divDif f Table (new  float [numPts] ) 

{  } 

FloatDDRl&  operator*  (const  FloatDDRl  &)  const;  //  binary  addition 

FloatDDRl&  operator-  (const  FloatDDRl  &)  const;  //  binary  subtraction 

}; 


Compared  with  class  FloatDD,  class  FloatDDRl  is  somewhat  impover¬ 
ished:  we  can  add  or  subtract  two  arbitrary  FloatDDRl’s;  however, 
because  we  do  not  have  full  divided-difference  tables  available,  we 
cannot  multiply  two  arbitrary  FloatDDRl’s;  nor  do  we  have  the  full 
A[x o,-,x„]  matrices  that  are  used  at  each  occurrence  of  the  independent 
variable.  We  finesse  these  difficulties  by  limiting  the  other  operations 
of  class  FloatDDRl  to  those  defined  by  the  friend  functions  indicated 
in  the  class  definition  given  above:  (i)  addition,  subtraction,  and  mul¬ 
tiplication  on  either  side  by  a  float  or  a  FloatV;  (ii)  division  on  the 
right  by  a  float  or  a  FloatV. 

The  operations  that  involve  a  float  argument  c  have  their  “obvi¬ 
ous”  meanings,  if  one  bears  in  mind  that  a  float  value  c  serves  as  a 
stand-in  for  a  full  matrix  c*I.  For  the  addition  (subtraction)  operations, 
c  is  only  added  to  (subtracted  from)  the  divDif  f  Table  [0]  entry  of  the 
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FloatDDRl  argument.  For  the  multiplication  (division)  operations,  all 
of  the  divDiffTable  entries  are  multiplied  by  (divided  by)  c. 

In  the  operations  that  involve  a  FloatV  argument,  the  FloatV  value 
serves  as  a  stand-in  for  a  full  ^4[XOv..;Xn]  matrix.  For  instance,  the  op¬ 
erator  for  multiplication  on  the  right  by  a  FloatV  can  be  thought 
of  as  performing  a  form  of  matrix  multiplication — but  specialized  to 
produce  only  the  first  row  of  the  output  divided-difference  table  (and 
to  use  only  values  that  are  available  in  the  given  FloatDDRl  and  FloatV 
arguments) : 


FloatDDRl&  operator*  (const  FloatDDRl  &fddrl,  const  FloatV  &fv){ 
FloatDDRl  *ans  =  new  FloatDDRl (fddrl .numPts) ; 
ans->divDiff Table [0]  =  fddrl .divDiffTable  [0]  *  fv.val[0]; 
for  (int  c  =  1;  c  <  fddrl .numPts ;  C++)  { 
ans->divDiff Table [c]  = 

fddrl .divDiff Table [c-1]  +  fddrl .divDiff Table [c]  *  fv.val[c]; 

} 

return  *ans ; 

} 


It  might  be  thought  that  the  operator  for  multiplication  on  the  left 
by  a  FloatV  does  not  have  the  proper  values  available  in  the  given 
FloatV  and  FloatDDRl  arguments  to  produce  the  first  row  of  the 
product  divided-difference  table  as  output.  (In  particular,  the  second 
argument,  which  is  of  type  FloatDDRl,  is  a  row  vector,  yet  we  want  to 
produce  a  row  vector  as  the  result.)  However,  it  is  easy  to  show  that 
divided-difference  matrices  are  commutative: 


F^*G^={F*G)^=(G*F)^  =  G^*F^.  (16) 

Consequently,  the  operator  for  multiplication  on  the  left  by  a  FloatV 
can  be  treated  as  if  the  FloatV  were  on  the  right: 


FloatDDRl&  operator*  (const  FloatV  &fv,  const  FloatDDRl  &fddrl){ 
return  fddrl  *  fv; 

} 


As  with  class  FloatDD,  the  division  operator  is  implemented  using  a 
form  of  back  substitution — specialized  here  to  compute  just  what  is 
needed  for  the  first  row  of  the  divided-difference  table: 
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FloatDDRl&  operator/  (const  FloatDDRl  &fddrl,  const  FloatV  &fv){ 
FloatDDRl  *ans  =  new  FloatDDRl (fddrl .numPts) ; 
ans->divDiff Table [0]  =  fddrl .divDiff Table [0]  /  fv.val[0]; 
for  (int  c  =  1;  c  <  fddrl .numPts ;  C++)  { 
ans->divDiff Table [c]  = 

(fddrl. divDiff Table [c]  -  ans->divDiff Table [c-1] )  /  fv.val[c]; 

} 

return  *ans ; 

} 


Because  only  a  limited  set  of  arithmetic  operations  are  available 
for  objects  of  class  FloatDDRl,  this  divided-difference  arithmetic  can 
only  be  applied  to  procedures  that  have  a  certain  special  syntactic 
structure,  namely  ones  that  are  “accumulative”  in  the  independent 
variable  (with  only  “right-accumulative”  quotients).  In  other  words, 
the  procedure  must  never  perform  arithmetic  operations  (other  than 
addition  or  subtraction)  on  two  local  variables  that  both  depend  on 
the  independent  variable. 

EXAMPLE  6.1.  The  procedure  Poly:  :Eval  for  evaluating  a  polyno¬ 
mial  via  Horner’s  rule  is  an  example  of  a  procedure  of  the  right  form. 
Consequently,  an  overloaded  version  of  the  function  Poly :  :  Eval  using 
FloatDDRl  arithmetic  can  be  written  as  shown  below  on  the  right: 

//  Evaluation  via  Horner’s  rule 

float  Poly: : Eval (float  x)  { 
float  ans  =  0.0; 

for  (int  i  =  degree;  i  >=  0;  i — ){ 
ans  =  ans  *  x  +  coeff[i]; 

} 

return  ans ; 

} 

//  Evaluation  via  Horner’s  rule 

FloatDDRl  Poly: : Eval (const  FloatV  &x) { 

FloatDDRl  ans (x. numPts) ;  //  ans  =  0.0 
for  (int  i  =  degree;  i  >=  0;  i — ){ 
ans  =  ans  *  x  +  coeff[i]; 

} 

return  ans ; 

} 
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In  Sect.  3,  Example  3.1  discussed  the  procedure  Poly:  :Eval_lDD,  a 
transformed  version  of  Poly :  :Eval  that  computes  the  value  of  the  first 
divided  difference  of  a  polynomial  with  respect  to  two  values,  x0  and 
With  the  way  that  the  overloaded  operations  are  defined  for  class 
FloatDDRl,  when  the  actual  parameter  supplied  for  x  is  a  FloatV  of 
length  two  consisting  of  x0  and  x1;  the  procedure 

FloatDDRl  Poly: :Eval (const  FloatV  &x) 

performs  essentially  the  same  steps  as  Poly:  :Eval_lDD. 

One  slight  difference  is  that,  in  addition  to  returning  the  value  of  the 
first  divided  difference,  the  FloatDDRl  version  also  returns  the  result 
of  evaluating  the  polynomial  on  x0. 

Another  difference  is  that  with  class  FloatDDRl,  because  of  our  trick 
for  handling  multiplication  by  a  FloatV  on  the  left  (cf.  Eqn.  16  and 
the  discussion  that  follows),  FloatDDRl  arithmetic  can  be  used  with 
programs  that  contain  multiplications  by  the  independent  variable  x  on 
the  left  as  well  as  on  the  right.  (The  transformation  used  in  Example  3.1 
could  also  be  enhanced  in  this  fashion.)  □ 

Some  empirical  results  that  illustrate  the  advantages  of  FloatDDRl 
arithmetic  in  a  useful  application  are  presented  at  the  end  of  Sect.  8. 

As  with  the  methods  discussed  in  Sects.  3  and  5,  FloatDDRl  arith¬ 
metic  can  be  used  to  produce  values  of  interest  for  computational 
differentiation.  For  instance,  suppose  we  have  transformed  procedure 
F: 


float  F (float  x)  ;  =>•  FloatDDRl  F (const  FloatV  &x)  ; 


When  all  of  the  xx  values  in  the  actual  parameter  supplied  for  FloatV 
x  are  the  same  value,  say  x,  then  the  FloatDDRl  value  returned  as  the 
output  holds  the  Taylor  coefficients  for  the  expansion  of  F  at  x  (cf.  For¬ 
mula  (12)).  Thus,  the  FloatV  divided-difference  arithmetic  generalizes 
previously  known  techniques  for  producing  accurate  Taylor  coefficients 
for  functions  defined  by  programs  (Rail,  1983;  Rail,  1984). 

If  we  attempt  to  use  FloatDDRl  arithmetic  in  a  procedure  that 
is  not  “accumulative”  in  the  independent  variable,  with  only  “right- 
accumulative”  quotients,  the  overload-resolution  mechanism  of  the  C++ 
compiler  will  detect  and  report  a  problem.  This  is  illustrated  by  the 
following  example: 

EXAMPLE  6.2.  Returning  to  the  future- value  calculation  of  Exam¬ 
ple  5.4,  because  the  desired  quantity  in  Eqn.  (14)  is  merely  a  first 
divided  difference,  we  might  attempt  to  carry  out  the  computation  by 
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means  of  FloatDDRl  arithmetic — hoping  to  save  the  cost  of  computing 
and  storing  the  (1,1)  entry  of  the  divided-difference  table — using  the 
following  overloaded  version  of  procedure  power: 


FloatDDRl  power (FloatV  &x,  unsigned  int  n)  { 
unsigned  int  mask  =  1  <<  (num_bits  -  1) ; 

FloatDDRl  ans (CONST,  2,  1.0);  //  ans  =  1.0 
for  (unsigned  int  i  =  0;  i  <  num_bits;  i++)  { 

ans  =  ans  *  ans;  //  overload-resolution  fails  here 
if  (mask  &  n) 
ans  =  ans  *  x ; 
mask  >>=  1; 

} 

return  ans ; 

} 


However,  at  the  statement 

ans  =  ans  *  ans ; 

procedure  power  multiplies  two  local  variables  that  both  depend  on  the 
independent  variable  x.  Consequently,  the  FloatDDRl  version  of  power 
is  not  accumulative  in  the  independent  variable.9  In  the  case  of  the 
Microsoft  Visual  C++  compiler,  the  following  error  message  is  issued: 

binary  ’*’  :  no  operator  defined  which  takes  a  left-hand  operand 
of  type  ’class  FloatDDRl’ 


□ 


7.  Multi-Dimensional  Computational  Divided  Differencing 

In  this  section,  we  explain  how  to  define  a  third  divided-differencing 
arithmetic  that  extends  our  techniques  to  handle  multi-dimensional 
computational  divided  differencing  (i.e. ,  computational  divided  differ¬ 
encing  of  functions  of  several  variables) . 

9  Because  the  (1,1)  entry  of  ans  is  always  1.0  in  the  invocation  of  power  from 
FutureValue,  a  version  of  power  specialized  to  this  context  would  be  able  to  save  the 
cost  of  computing  and  storing  the  (1, 1)  entry.  This  example  illustrates  a  limitation 
of  an  approach  based  solely  on  overloading;  however,  in  principle,  this  limitation 
could  be  overcome  using  sufficiently  powerful  partial-evaluation  methods. 
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7.1.  Background  Discussion 

As  background  to  the  material  that  will  be  discussed  in  Sect.  7.2, 
let  us  reiterate  a  few  points  concerning  the  divided-difference  tables 
that  result  from  computational  divided  differencing  of  functions  of  a 
single  variable.  In  the  following  discussion,  we  assume  that  the  divided- 
difference  table  in  question  has  been  constructed  with  respect  to  some 
known  collection  of  values  xq,  x\,  . . .,  xn. 

As  mentioned  at  the  beginning  of  Sect.  6,  a  divided-difference  table 
can  be  thought  of  as  a  (redundant)  representation  of  an  interpolating 
polynomial.  For  instance,  if  you  have  a  divided-difference  table  T  (and 
know  the  appropriate  vector  of  values  xq,  x\,  . . .,  xn ,  as  well),  you  can 
explicitly  construct  the  interpolating  polynomial  in  Newton  form  by 
using  the  values  in  the  first  row  of  T  in  accordance  with  Eqn.  (15). 
One  of  the  consequences  of  this  point  is  so  central  to  what  follows  in 
Sect.  7.2  that  it  is  worthwhile  to  state  it  explicitly  and  to  introduce 
some  helpful  notation: 

OBSERVATION  7.1.  [Representation  Principle].  A  divided-difference 
table  T  is  a  finite  representation  of  a  function  Func{Tj  defined  by 
Eqn.  (15).  (Note  that  if  F  —  Func\T\,  then  T  —  F^.) 

Given  two  divided- difference  tables,  T\  and  T-2,  that  are  defined  with 
respect  to  the  same  set  of  points  xo,  x\,  . . .,  xn,  the  operations  of  matrix 
addition,  subtraction,  multiplication,  and  division  applied  to  T\  and 
T-2  yield  representations  of  the  sum,  difference,  product,  and  quotient, 
respectively,  of  Func\F\\  arid  EVncpa].  □ 

In  other  words,  the  operations  of  class  FloatDD  provide  ways  to 

(i)  instantiate  representations  of  functions  of  one  variable  (by  evaluat¬ 
ing  programs  in  which  floats  have  been  replaced  by  FloatDDs),  and 

(ii)  perform  operations  on  function  representations  (i.e. ,  by  addition, 
multiplication,  etc.  of  FloatDD  values). 

It  is  also  worthwhile  restating  the  Computational  Divided-Differencing 
Principle  (Observation  5.2),  adding  the  additional  remark  given  in  the 
second  paragraph: 

OBSERVATION  7.2.  [Computational  Divided-Differencing  Principle 
Redux],  Rather  than  computing  a  divided-difference  table  with  respect 
to  the  points  xq,  x\,  . . .,  xn  by  invoking  the  program  n  +  1  times  and 
then  applying  Eqns.  (8)  and  (9),  we  may  instead  evaluate  the  program 
(once)  using  a  divided- difference  arithmetic  that  overloads  arithmetic 
operations  as  matrix  operations,  substituting  A^  ^  for  each  occur¬ 
rence  of  the  formal  parameter  x,  and  c  *  I  for  each  occurrence  of  a 
constant  c. 
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Furthermore,  this  principle  can  be  applied  to  divided- difference  tables 
for  functions  on  any  field  (because  addition,  subtraction,  multiplica¬ 
tion,  and  division  operations  are  required,  together  with  additive  and 
multiplicative  identity  elements).  □ 


7.2.  Computational  Divided  Differencing  of  Functions  of 
Several  Variables 

We  now  consider  the  problem  of  defining  an  appropriate  notion  of  di¬ 
vided  differencing  for  a  function  F  of  several  variables.  Observation  7.1 
provides  some  guidance,  as  it  suggests  that  the  generalized  divided- 
difference  table  for  F  that  we  are  trying  to  create  should  also  be  thought 
of  as  a  representation  of  a  function  of  several  variables  that  interpolates 
F.  Such  a  generalized  computational  divided-differencing  technique  will 
be  based  on  the  combination  of  Observations  7.1  and  7.2. 

Because  we  have  already  used  the  term  higher-order  to  refer  gener- 
ically  to  second,  third,  ...,  nth  divided  differences,  we  use  the  term 
higher-kind  to  refer  to  the  generalized  divided-difference  tables  that 
arise  with  functions  of  several  variables.  In  the  remainder  of  this  sec¬ 
tion,  we  make  use  of  an  alternative  notation  for  the  divided-difference 
operator  ,*«]. 


DD[» . ...jIUl  = 

We  use  DD1  [F]  when  the  xt  are  understood,  and  abbreviate  ranges 
of  variables  in  the  usual  way,  e.g.,  DD^  ^F]  =  DDf^^^F]  = 

F^[xo,xi,x2,x3]_  rp}ie  notation  DD 1  [•]  refers  to  divided-difference  tables 
of  kind  1  (the  kind  we  are  already  familiar  with  from  Sect.  5,  namely 
FloatDD  values).  Below,  we  use  DD2[-]  to  refer  to  divided-difference 
tables  of  kind  2;  in  general,  we  use  DDfe[-]  to  refer  to  divided-difference 
tables  of  kind  k. 

To  understand  the  basic  principle  that  underlies  our  approach,  con¬ 
sider  the  problem  of  creating  a  surface  that  interpolates  a  two- variable 
function  F(x,y)  with  respect  to  a  grid  formed  by  three  coordinate 
values  x'o,  aq,  a,' 2  in  the  .x-dimension.  and  four  coordinate  values  yo ,  y\. 
y-2 ,  2/3  in  the  y-dimension.  The  clearest  way  to  explain  the  technique  in 
common  programming-language  terminology  involves  currying  F.  That 
is,  instead  of  working  with  F  :  float  x  float  — >  float ,  we  work  with 
F  :  float  — >  float  — >  float.  We  can  create  (a  representation  of)  an 
interpolating  surface  for  F  (i.e.,  a  divided-difference  table  of  kind  2, 
denoted  by  :i  [F ] )  by  building  a  divided-difference  table  of 
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kind  1  using  the  functions  F(x o),  F(x i),  and  F(x 2),  each  of  which  is 
of  type  float  — >  float ,  as  the  “interpolation  points”.10 

Note  that  this  process  requires  that  we  be  capable  of  performing  ad¬ 
dition,  subtraction,  multiplication,  and  division  of  functions.  However, 
each  of  the  functions  F(x 0),  F{x\):  and  F(x 2)  is  itself  a  one-argument 
function  for  which  we  can  create  a  representation,  namely  by  build¬ 
ing  the  divided-difference  tables  DD12/o  ^[F(fc0)],  DD^^F^i)],  and 

[3/0,3]  [-^(^2)]  (with  respect  to  the  coordinate  values  yij.  y\ ,  y2-  and 
3/3).  By  Observation  7.2,  the  arithmetic  operations  on  functions  -FX^o), 
F(x  1),  and  F(x 2)  needed  to  create  DDj^  ^  ^  [_F]  can  be  carried 

out  by  performing  matrix  operations  on  the  matrices  DD^j  P^o)], 
DDto,3]P(^)l,  and  DD[yo,3]  ) J •  For  instance, 

=  (i7) 

Xq-Xi 


In  what  follows,  it  is  convenient  to  express  functions  using  lambda 
notation  (i.e. ,  in  Xz.exp.  z  is  the  name  of  the  formal  parameter,  and  exp 
is  the  function  body).  For  instance,  Xx.Xy.x  denotes  the  curried  two- 
argument  function  (of  type  float  — >  float  — >  float)  that  merely  returns 
its  first  argument.  For  our  purposes,  the  advantage  of  lambda  notation 
is  that  it  provides  a  way  to  express  the  anonymous  one-argument  func¬ 
tion  that  is  returned  when  a  curried  two-argument  function  is  supplied 
a  value  for  its  first  argument  (e.g.,  ( Xx.Xy.x)(xo )  returns  Xy.xo). 

In  short,  the  idea  is  that  a  divided-difference  table  of  kind  2  for 
function  F  is  a  matrix  of  matrices: 


DD 


[2:0,2],  [3/0, 3] 


m 


*1 


[xo,2]IA;C-DD  [2/0,3] 


DD 

I* 

dd^3][^o)J 


[^(z) 


DDL  3i  [^[2:0,  aa]] 
DDlo,3]^)l 


the  matrix  shown  in  Fig.  3 


DD>c  ■tfl'Xo-xi-xF  \ 
D°[,0  3][^^]J 
DD[»0,3]F(I2)]  ) 
(18) 


It  is  instructive  to  consider  some  concrete  instances  of  DD|o  ^  ^  PI 

for  various  F”s: 


10  This  idea  is  a  specific  instance  of  the  very  general  approach  to  surface  approx¬ 
imation  via  the  tensor-product  construction  given  in  (de  Boor,  1978,  Chap.  XVII). 
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1 

Cts 

CO 


H 

>P 


to 


F(x$HV0)  -F(*o)[j/0,l]  f(*o)[l/0,2]  -P(*o)[l/0,3] 
-P(*o)[!/l,2]  -P(*o)[!/l,3] 
F(xo)(y2)  -F(*o)[j/2,3] 

^C^oXys) 


Fxa,\'Ayc)  -F[*o,i][»o,i]  •p[*o,i][yo,2]  /  *o.i 

yi»0,l][»l,2]  -P[®0,l][j/1,3] 

-P[®0,l](!/2)  -P[*0,l][j/2,3] 

-F[*0,l](!J3) 

(JX^iXyo)  F(xiY.yc..\ F(xi)[yo,n]  -F(®i)[j/o,3] 

r/^lXyi)  F(xl)[yi,2] 

F(xl)(y2)  -F’(a31 )  [j/2,3] 

^(^1 )  (J/3) 


F[*0,2](yo)  -*?[a=0,2.]E*0,l]  -P[*0,2][»0,2]  F  [*0,2]  [j/0,3]  \  \ 
F[x0,2](yi)  ^[*0,^ltl/l,2]  -P’E*Q,2][Wl,a]  I 

-P[*0,2](j/2)  F  [*0,2]  [l/2,3]  I 
-P[*0,2](j/3)  / 

Flxi  ,2](.yo)  fL«i,a|jl/o,il  F[*i,2][yo,2]  r’[*i,2][yo,s]  \ 
F[xi,2](yi)  Jr[*i,#Iti/i,2]  ^fe^Hi/i.s]  I 
i?[*i,2](y2)  i-  xt.2i.v?..:t.  I 

-P[*l,2](j/3)  / 

(-P(*2)(yo)  -P(*2)[yo,i]  F(x2)[yo,2]  -P(*2)[yo,s]  \ 
F(x2)(yi)  F(x2)[yi,2]  -p(*2)[yi,3]  ] 

F(x2)(V2)  -F(*2)[y2,s]  I 

F(x2)(y3)  J  ) 


The  divided-difference  table  of  kind  2  for  funcfi°n 
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EXAMPLE  7.3.  Consider  the  function  Xx.Xy.x.  For  0  <  i  <  2,  we 
have 


DD[j/o,s]  Ay.ar)  =  DD^jAy.^] 


/  Xi 

V 


0  0  0  \ 
Xi  0  0 

Xi  0 

Xi  J 


and,  for  0  <  i  <  1,  we  have 
DDk  ^[(Az.Ay.zjfa^Zj+i]] 


=  DD 


=  DD 


[3/0,3]  I 


[3/0,3] 


=  DD 


Consequently, 


[3/0,3] 

/  1  0  0  0  \ 
1  0  0 
1  0 
\  1 ) 


(  /  x  0  0  0  0 
xo  0  0 
xo  0 
so 


(Xx.Xy.x)(xj)  -  (Xx.Xy.x)(xi+i) . 

Xi  -  xi+i 
„Xy.Xj  -  Xy.xi+i 
Xi  -  xi+i 

[Ay-1] 


DDfso,2],[3/o,3][Aa:-Al/-;rl  = 


1000 
100 
1  0 
1 

xi  0  0  0 
xi  0  0 
xi  0 
xi 


/  0 

0 

0 

0 

0 

0 

0 

0 

0 

V 

0 

(  1 

0 

0 

0 

1 

0 

0 

1 

0 

V 

1 

X2 

0 

0 

0 

X2 

0 

0 

X2 

0 

X2 

(19) 

□ 

EXAMPLE  7.4.  Consider  the  function  Xx.Xy.y.  For  0  <  i  <  2,  we 
have 


DDbo,3]  I(Aa:;- A2/-2/)  )1  =  DDL.3l!A^l  = 


[2/0,3]  II 


/  yo  1  0  0  \ 

yi  1  0 

V2  1 

V  V3  J 


and,  for  0  <  i  <  1,  we  have 
^[y0!3]l(^x.Xy.y)[xi,Xi+1}J  =  DD^a][ 


AXx.Xy.y)(xj)  -  {Xx.Xy.y){xi+1) 
Xi  -  xi+i 
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=  DD 


1 

[2/0,3] 


Ay-1/  ~  A y.y 

X{  %i-\- 1 


-  DD[3/o,3]^-°1 


/  0  0  0  0 
000 
0  0 

V  0 


\ 


Consequently,  we  have 


/  /  3/o  1  0  0 

I  3/1  1  0 

l  3/2  1 

\  3/3 


DD 


2 

[X0,2],[»0,3] 


{Xx.Xy.yJ 


\ 


(0000 
000 
0  0 
0 

3/0  1  0  0 

1/1  1  0 

3/2  1 


3/3 


□ 


Moreover,  Observation  7.2  tells  us  that  divided-difference  tables  for 
functions  of  two  variables  can  be  built  up  by  means  of  a  divided- 
difference  arithmetic  that  operates  on  matrices  of  matrices.  That  is, 
we  can  build  up  divided-difference  tables  of  kind  2  for  more  complex 
functions  of  x  and  y  by  using  operations  on  matrices  of  matrices,  sub¬ 
stituting  DD2|Ax.Ay.x]  for  each  occurrence  of  the  formal  parameter  x 
in  the  function,  and  DD2[Ax.Ay.y]  for  each  occurrence  of  the  formal 
parameter  y. 


EXAMPLE  7.5.  For  the  function  Xx.X y.(x  x  y),  DD2[Aa:.A y.(x  x  y)J 
can  be  created  by  multiplying  the  matrices  DD2[Aa;.Ai/.a:]  and  DD2|Aa:.Ai/.i/] 
from  Eqns.  (19)  and  (20),  respectively: 


DD2|Ax.A y.(x  x  y)j  =  DD2[(Aa:.Ay.a:)  x  (Xx.Xy.y)}  (21) 
=  DD2|Aa3.Ay.a:]  x  DD2[Aa:.Ai/.i/] 
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Note,  for  example,  that  the  (0,1)  entry  in  the  above  matrix,  namely 

(  Vo  1  0  0  \ 

yi  i  o 

2/2  i 

V  2/3  / 


was  obtained  via  the  calculation 


+ 


+ 


/  xo  0  0 

Xq  0 
Xq 


0\ 

0 


(  0 


V 

/  1  0  0  0  \ 
1  0  0 
1  0 

V  1  / 

/  0  0  0  0  \ 

0  0  0 
0  0 

V  0  / 


Xo  J  \ 

f  yo  i 

x  yi 

\ 

/  0  0  0 
0  0 

x  0 

V 


0  0  0  \ 
0  0  0 
0  0 
0  / 
0  0  \ 

1  0 
V2  1 
1/3  J 

0  \ 

0 

0 

0  / 


(22) 


and  not  by  the  use  of  Eqn.  (17),  which  involves  a  matrix  subtraction, 
a  scalar  subtraction,  and  a  scalar  division: 


DDLs]K^-(*  X 


DDjoJ^O  ><  1/)1  ~  DDL,3]N.(n  X  y)] 
Xq  -  Xi 


(  x0y0  x0  0  0  ^ 

xoyi  xo  0 

X0V2  x0 

V  x0y3  ) 


l  xiyo  xi  0  0  \ 

xiyi  x i  0 

xiy-2  xi 

V  xiy3  j 


xo  -  Xi 


(23) 


Expressions  (22)  and  (23)  are  equivalent  over  real  numbers,  but  not 
over  floating-point  numbers.  By  sidestepping  the  explicit  subtraction 
and  division  operations  in  expression  (23),  expression  (22)  avoids  the 
potentially  disasterous  magnification  of  round-off  error  that  can  occur 
with  floating-point  arithmetic.  □ 


The  principle  illustrated  in  Example  7.5  gives  us  the  machinery 
that  we  need  to  perform  computational  divided  differencing  for  bivari¬ 
ate  functions  defined  by  programs.  As  usual,  computational  divided 
differencing  is  performed  by  changing  the  types  of  formal  parame¬ 
ters,  local  variables,  and  return  values  to  the  type  of  an  appropriate 
divided-difference  arithmetic. 
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Furthermore,  these  ideas  can  be  applied  to  a  function  F  with  an 
arbitrary  number  of  variables:  when  F  has  k  variables,  DD^F],  F' s 
divided-difference  table  of  kind  k,  is  a  matrix  of  matrices  of  . . .  of 
matrices  nested  to  depth  k.  Currying  with  respect  to  the  first  parameter 
of  F  “peels  off”  one  dimension;  DD/s  |+]  is  a  matrix  whose  entries 
are  divided-difference  tables  of  kind  k  —  1  (i.e. ,  matrices  of  matrices 
of  . . .  of  matrices  nested  to  depth  k  —  1).  For  instance,  the  diagonal 
entries  are  the  divided-difference  tables  of  kind  k  —  1  for  the  (k  — 
l)-parameter  functions  F(x o),  F(x i),  ...,  F(xn)  (i.e.,  DD*-1^^)]], 
DD*-1[F(a;1)],..,DD*-1[ii’(a:fl)]). 

To  implement  this  approach  in  C++,  we  define  two  classes  and  one 
class  template: 


—  Class  template  template  <int  k>  class  DivDiffArith  can  be 
instantiated  with  a  value  k  >  0  to  represent  divided-difference 
tables  of  kind  k.  Each  object  of  class  DivDif  f  Arith<k>  has  links 
to  sub-objects  of  class  DivDiff  Arith<k-1>. 


—  Class  DivDif  f  Arith<0>  represents  the  base  case;  DivDif  f  Arith<0> 
objects  simply  hold  a  single  float. 


—  Class  IntVector,  a  vector  of  int’s,  is  used  to  describe  the  number 
of  points  in  each  dimension  of  the  grid  of  coordinate  points. 


Excerpts  from  the  definitions  of  these  classes  are  shown  below: 


template  <int  k>  class  DivDiffArith  { 
public: 
int  numPts; 

DivDiff Arith<k-1>  **divDiff Table;  //  Two-dimensional  upper-triangular  array 
DivDiff Arith(const  FloatV  &v,  const  IntVector  &grid,  int  d) ; 

DivDiff Arith(float ,  const  IntVector  &grid) ;  //  constant;  shape  conforms  to  grid 
DivDiff Arith(float ,  const  DivDiff Arith<k-1>  fedda) ;  //  constant;  shape  conforms  to  dda 
DivDiff Arith<k>&  operator+  (const  DivDiff Arith<k>  &)  const;  //  binary  addition 

DivDiff Arith<k>&  operator-  (const  DivDiff Arith<k>  &)  const;  //  binary  subtraction 

DivDiff Arith<k>&  operator*  (const  DivDiff Arith<k>  &)  const;  //  binary  multiplication 

DivDiff Arith<k>&  operator/  (const  DivDiff Arith<k>  &)  const;  //  binary  division 
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class  DivDif f Arith<0>  { 
public: 
float  value; 

DivDiff Arith(float  v  =  0.0);  //  Default  constructor 
DivDiff Arith(const  FloatV  &v,  const  IntVector  &grid, 
DivDiff Arith<0>&  operator+  (const  DivDiff Arith<0>  &) 
DivDiff Arith<0>&  operator-  (const  DivDiff Arith<0>  &) 
DivDiff Arith<0>&  operator*  (const  DivDiff Arith<0>  &) 
DivDiff Arith<0>&  operator/  (const  DivDiff Arith<0>  &) 


int  d) ; 
const;  // 
const;  // 
const;  // 
const;  // 


binary  addition 
binary  subtraction 
binary  multiplication 
binary  division 


class  IntVector  { 
public: 
int  numPts; 

int  *val;  //  An  array  of  values:  val [0] . .val [numPts- 1] 
IntVector  () ; 

IntVector (int  N,  ...);  //  Construct  IntVector  given  N  values 
IntVector&  operator<<  (const  int  i) ;  //  left  shift 

}; 


The  operations  of  class  DivDif  f  Arith<k>  are  overloaded  in  a  fashion 
similar  to  those  of  class  FloatDD.  (Class  FloatDD  is  essentially  identi¬ 
cal  to  DivDiff Arith<l>.)  For  instance,  the  overloaded  multiplication 
operator  performs  matrix  multiplication: 


template  <int  k> 

DivDif fArith<k>&  DivDif fArith<k> :: operator*  (const  DivDif fArith<k>  &dda)  const{ 
assert (numPts  ==  dda. numPts) ; 

DivDiff Arith<k>  *ans  =  new  DivDif fArith<k> (numPts) ; 
for  (int  r  =  0;  r  <  numPts;  r++)  { 
for  (int  c  =  r;  c  <  numPts;  C++)  { 

DivDif fArith<k-l>  temp((f loat)0.0,  divDiff Table [r] [c] ) ;  //  temp  =  0.0 
for  (int  j  =  r;  j  <=  c;  j++)  { 

temp  +=  divDiff Table [r]  [j]  *  dda.divDif f Table [j]  [c]  ; 

} 

ans->divDiff Table [r] [c]  =  temp; 

} 

} 

return  *ans; 

} 


Class  DivDif  f  Arith<k>  has  two  constructors  for  creating  a  DivDif  f  Arith<k> 
object  from  a  float  constant.  They  differ  only  in  their  second  argu¬ 
ments  (an  IntVector  versus  a  DivDiff Arith<k>),  which  are  used  to 
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determine  the  appropriate  dimensions  to  use  at  each  level  in  the  nesting 
of  matrices. 

Suppose  that  in  the  function  on  which  computational  divided  dif¬ 
ferencing  is  to  be  carried  out,  variable  z  is  the  independent  variable 
associated  with  argument  position  d+1.  To  generate  an  appropriate 
DivDiff  Arith<k>  object  for  z  for  a  given  set  of  grid  values  z0,  . . zm, 
a  FloatV  with  the  values  z0,  . . .,  zm  is  created,  and  then  passed  to  the 
following  DivDiff Arith<k>  constructor: 

template  <int  k> 

DivDiff Arith<k> : :DivDiffArith (const  FloatV  &v,  const  IntVector  &grid,  int  d)  : 

numPt  s ( gr id . val [0]  ) , 

divDif f Table (calloc_ut<  k  > (numPt s)) 

{ 

assert (grid. val [d]  ==  v.numPts); 

IntVector  tail  =  grid  <<  1; 

for  (int  r  =  0;  r  <  numPts;  r++)  { 

for  (int  c  =  r;  c  <  numPts;  C++)  { 

divDif f Table [r] [c]  =  DivDiffArith<k-l>((float)0.0,tail) ; 

} 

} 

if (d  ==  0)  { 

DivDiff Arith<k-1>  one( (float) 1 .0,  tail); 

for  (int  i  =  0;  i  <  numPts;  i++)  { 

divDif f Table [i] [i]  =  DivDif fArith<k-l> (v. val [i] , tail) ; 
if(i  <  numPts  -  1)  { 

divDif f Table [i] [i+1]  =  one; 

} 

} 

} 

else  { 

for  (int  i  =  0;  i  <  numPts;  i++)  { 

divDif f Table [i] [i]  =  DivDiffArith<k-l>(v,tail,d-l) ; 

} 

} 

} 

EXAMPLE  7.6.  The  following  code  fragment  generates  two  DivDif  f  Arith<2> 
values,  x  and  y,  which  correspond  to  the  matrices  shown  in  Eqns.  (19) 
and  (20),  respectively: 

IntVector  grid(2 ,3 ,4) ; 

FloatV  fv_x(3,xo,xi ,X2) ; 

DivDif fArith<2>  x(fv_x,grid,0)  ;  //  argument  position  1 
FloatV  fv_y(4,y0,yi ,y2,y3)  ; 

DivDif fArith<2>  y (fv_y ,grid, 1) ;  //  argument  position  2 
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□ 


EXAMPLE  7.7.  Consider  a  C++  class  BivariatePoly  that  represents 
bivariate  polynomials,  and  a  member  function  BivariatePoly:  :Eval 
that  evaluates  a  polynomial  via  a  bivariate  version  of  Horner’s  rule: 


class  BivariatePoly  { 
public : 

float  Eval (float , float ) ; 
private : 

int  degree 1 ,degree2 ; 

//  Array  coef f [0 .. degree 1] [0 . .degree2] 
float  **coef f ; 

}; 

//  Evaluation  via  bivariate  Horner’s  rule 
float  BivariatePoly :: Eval (float  x, float  y){ 
float  ans  =  0.0; 

for  (int  i  =  degreel;  i  >=  0;  i — ){ 
float  temp  =  0.0; 

for  (int  j  =  degree2;  j  >=  0;  j — ){ 
temp  =  temp  *  y  +  coeff [i]  [j]  ; 

} 

ans  =  ans  *  x  +  temp; 

} 

return  ans ; 

} 


Similar  to  what  has  been  done  in  Examples  3.1,  4.1,  5.3,  and  6.1,  com¬ 
putational  divided  differencing  is  carried  out  on  this  version  of  Eval  by 
changing  the  types  of  its  formal  parameters,  local  variables,  and  return 
value  from  float  to  DivDiff Arith<2>. 
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//  Evaluation  via  bivariate  Horner’s  rule 

DivDif f Arith<2>  BivariatePoly: :Eval (const  DivDif fArith<2>  &x, 

const  DivDif fArith<2>  &y) 

{ 

DivDif fArith<2>  ans(0.0,x);  //  ans  =  0.0 
for  (int  i  =  degreel;  i  >=  0;  i--){ 

DivDif fArith<2>  temp(0.0,y);  //  temp  =  0.0 
for  (int  j  =  degree2;  j  >=  0 ;  j — ){ 
temp  =  temp  *  y  +  coeff [i]  [j]  ; 

} 

ans  =  ans  *  x  +  temp; 

} 

return  ans ; 

} 


To  use  this  procedure  to  create  a  divided-difference  table  of  kind  2  for 
a  given  variable  P  of  type  BivariatePoly*,  with  respect  to  the  3-by- 
4  grid  {x0,xi,x2}  x  {y0,yi,y2,y3},  we  would  generate  the  IntVector 
grid  and  DivDiff  Arith<2>  values  x  and  y  as  shown  in  Example  7.6, 
and  then  invoke 


P->Eval(x,y) ; 


□ 


One  final  point  concerning  this  approach:  it  is  worthwhile  noting 
that  by  generalizing  the  grid  descriptors  slightly,  it  is  possible  to  devise 
an  even  more  general  divided-differencing  arithmetic  that  is  hetero¬ 
geneous  in  shape  with  respect  to  different  argument  positions.  By 
“heterogeneous” ,  we  mean  that  full  two-dimensional  (upper-triangular) 
divided-difference  tables  could  be  provided  for  some  argument  posi¬ 
tions,  while  other  argument  positions  could  just  provide  a  single  row 
of  divided  differences  (i.e.,  one-dimensional,  FloatDDRl-like  tables).  By 
this  means,  when  a  procedure  body  is  “accumulative”  in  certain  of  its 
formal  parameters  but  not  others,  it  would  be  possible  to  tailor  the 
divided-differencing  version  of  the  procedure  to  improve  its  efficiency. 
(In  the  case  of  procedure 

DivDiff Arith<2>  BivariatePoly: :Eval, 

it  would  be  possible  to  specify  that  both  argument  positions  provide 
FloatDDRl-like  tables.) 
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8.  Paige’s  Work  on  Finite  Differencing  of  Computable 

Expressions 


In  earlier  sections,  and  also  in  Sect.  9,  we  have  attempted  to  place 
the  ideas  that  are  developed  in  this  paper  in  their  proper  context 
by  describing  how  they  relate  to  previous  work  on  computational- 
differentiation  (Wengert,  1964;  Rail,  1981;  Griewank  and  Corliss,  1992; 

Berz  et  al.,  1996;  Griewank,  2000)  and  to  previous  work  on  the  creation 
of  accurate  divided-difference  tables  for  expressions  (Opitz,  1964;  Mc¬ 
Curdy,  1980).  In  the  present  section,  we  discuss  some  additional  in¬ 
tellectual  forbearers  of  our  work,  focusing  particularly  on  how  our 
ideas  relate  to  Robert  Paige’s  work  on  finite  differencing  of  set-valued 
expressions  in  SETL  programs. 

Starting  in  the  mid-70s,  Paige  studied  how  finite-differencing  trans¬ 
formations  of  applicative  set-former  expressions  could  be  exploited  to 
optimize  loops  in  very-high-level  languages,  such  as  SETL.  References  (Paige, 
1981;  Paige  and  Koenig,  1982)  are  just  two  of  many  works  that  Paige 
wrote  about  this  subject,  and  these  ideas  were  implemented  in  his 
RAPTS  system  (Paige  and  Koenig,  1982;  Paige,  1983).  Some  of  the 
techniques  that  Paige  explored  have  their  roots  in  earlier  work  by 
Earley  (Earley,  1974;  Earley,  1976).  Independently  of  and  contempora¬ 
neously  with  Paige,  similar  loop-optimization  methods  targeted  toward 
very-high-level  set-theoretic  languages  were  investigated  by  Fong  and 
Ullman  (Fong  and  Ullman,  1976;  Fong,  1977;  Fong,  1979).  More  re¬ 
cently,  Liu  and  Stoller  have  used  some  extensions  of  these  ideas  to 
optimize  array  computations  (Liu  and  Stoller,  1998)  and  recursive  pro¬ 
grams  (Liu  and  Stoller,  2000).  Liu  et  al.  have  also  shown  how  such 
transformations  can  be  applied  to  derive  algorithms  for  incremental- 
computation  problems  (i.e. ,  problems  in  which  the  goal  is  to  main¬ 
tain  the  value  of  some  function  F(x)  as  the  input  x  undergoes  small 
changes)  (Liu,  1995;  Liu  and  Teitelbaum,  1995a;  Liu  and  Teitelbaum, 
1995b;  Liu  et  al.,  1996;  Liu  and  Stoller,  1999). 

Both  Paige  (Paige,  1981)  and  Paige  and  Koenig  (Paige  and  Koenig, 

1982)  provide  lengthy  discussions  of  the  roots  of  the  ideas  that  are 
developed  in  those  papers.  The  basic  idea  for  optimizing  SETL  loops  is 
described  as  a  generalization  of  strength  reduction,  an  idea  attributed 
to  John  Cocke  from  the  60s,  whereby  a  loop  is  transformed  so  that 
a  multiplication  operation  in  the  loop  is  eliminated  in  favor  of  an 
addition,  as  shown  below: 


paper.tex;  27/03/2001;  10:42;  p.51 


52 


T.W.  Reps  and  L.B.  Rail 


i  =  .  .  .  ; 
while  (...)  { 

. . .  i*c  . .  .  ; 
i  =  i  +  delta; 

} 


i  =  .  .  .  ; 

T  =  i*c;  //  T  depends  on  i 
deltaT  =  delta*c; 
while  (...)  { 

...  T  ...;  //  T  replaces  i*c 
i  =  i  +  delta;  //  change  to  i 
T  =  T  +  deltaT  //  update  of  T 

} 


This  transformation  improves  the  running  time  of  the  loop  if  the  cost  of 
the  additions  performed  by  the  transformed  loop  are  less  than  the  cost 
of  the  multiplications  performed  in  the  original  loop.  In  (Cocke  and 
Schwartz,  1970),  Cocke  and  Schwartz  presented  a  variety  of  strength- 
reduction  transformations  for  use  in  optimizing  compilers. 

Paige’s  work  on  loop  optimization  in  SETL  was  based  on  the  ob¬ 
servation  that  a  similar  transformation  could  be  applied  to  loops  that 
involve  set-former  expressions.  In  this  transformation,  an  expensive  set- 
former  expression  in  a  loop  is  replaced  by  a  set-initialization  statement 
(placed  before  the  loop)  together  with  a  set-update  operation  (placed 
inside  the  loop): 


A  =  ...; 
while  (...)  { 

...  {  x  G  A  |  x°/,2  ==  0  }  .  .  .  ; 
d  =  ...; 

A  =  A  U  d; 

} 


A  =  ...; 

T  =  {  x  €  A  |  x%2  ==  0  } ;  //  T  depends  on  A 
while  (...)  { 

...  T  ...;  //  T  replaces  the  set  former 
d  =  ...; 

A  =  A  U  d;  //  change  to  A 

if  (d%2  ==  0)  T  =  T  U  d;  //  update  of  T 

} 


In  the  transformed  program  shown  above  on  the  right,  the  expression 
{  x  E  A  |  x%2  ==  0  }  in  the  loop  is  replaced  by  a  use  of  T.  Because 
the  statement 


A  =  A  U  d; 

may  alter  the  value  of  variable  A,  just  after  this  statement  a  new 
statement  is  introduced: 

if  (d°/.2  ==  0)  T  =  T  U  d; 
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The  latter  statement  updates  the  value  of  variable  T  to  have  the  same 
value  that  the  expression  {  x  E  A  |  x%2  ==  0  }  has  when  evaluated 
with  the  new  value  of  variable  A.11 

Of  more  direct  relevance  to  the  topic  of  the  present  paper  is  the 
discussion  in  Paige’s  papers  in  which  he  points  out  affinities  between 
his  work,  on  the  one  hand,  and  differentiation  and  finite  differencing 
of  numerical  functions,  on  the  other  hand.12  For  instance,  Paige  and 
Koenig  describe  the  relationship  between  their  SETL  finite-differencing 
methods  and  numerical  finite-difference  methods  as  follows  (Paige  and 
Koenig,  1982,  pp.  403-404): 

It  is  interesting  to  note  that  the  origins  of  our  method  may  be 
traced  back  to  the  finite  difference  techniques  introduced  by  the 
English  mathematician  Henry  Briggs  in  the  sixteenth  century.  His 
method,  which  can  be  used  to  generate  a  sequence  of  polynomial 
values  p{xq),  p{x o  +  h),  p{x o  +  2 h),  . . .,  hinges  on  the  following 
idea.  For  a  given  polynomial  p(x)  of  degree  n  and  an  increment  h. 
the  first  difference  polynomial 

pi(x)  -  p(x  +  h )  -p(x) 

is  of  degree  n  —  1  or  less,  the  second  difference  polynomial 
p2(x)  =Pl(x  +  h )  -pi{x) 

is  of  degree  n  —  2  or  less,  . . .,  and,  finally,  pn(x)  must  be  a  constant. 
Thus,  to  tabulate  successive  values  of  p{x)  starting  with  x  —  xq, 
we  can  perform  these  two  steps: 

1.  Calculate  initial  values  for  p(x o),  Pi(xq),  . . .,  pn{x o)  and  store 

them  in  t(l),  i(2),  . . .,  t(n  +  1). 

2.  Generate  the  desired  polynomial  table  by  iterating  over  the 

following  code  block: 


11  The  code  fragment  if  (d"/,2  ==  0)  T  =  T  U  d;  is  called  a  post- derivative  of 
T  =  {  x  E  A  |  x‘/,2  ==  0  };  with  respect  to  the  change  A  =  A  U  d;.  Similarly,  a 
code  fragment  for  updating  variable  T  that  is  placed  before  the  change  A  =  A  U  d; 
is  called  a  pre-derivative. 

12  The  (forward)  finite  difference  of  a  function  F  with  respect  to  h  is  defined  as 
follows: 

A hF(x)  d=F(x  +  h)  -F(x). 
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print  x,  t(  1);  $  print  x  and  p(x) 

t(  1)  :=  t(l)  +  t( 2);  $  place  new  values  for 

t{ 2)  :=  t{ 2)  +  f(3);  $  p(x),  p,  {x\,  . . .,  pn-i(x) 

:  $  into 

t(n)  :=  t(n)  +  t(n  +  1);  $  t(  1),  t( 2),  . . t(n). 

x  :=  x  +  h;  $ 

Note  that  Briggs’s  method  requires  only  n  additions  in  step  2 
to  compute  each  new  polynomial  value,  while  Horner’s  rule  to 
compute  a  fresh  polynomial  value  costs  n  additions  and  n  mul¬ 
tiplications. 

They  relate  Briggs’s  method  to  strength  reduction  in  the  following 
passage  (Paige  and  Koenig,  1982,  pp.  404-405): 

Although  Cocke’s  technique  does  not  treat  polynomials  as  special 
objects,  strength  reduction  is  sufficiently  powerful  to  transform  a 
program  involving  repeated  calculations  of  a  polynomial  accord¬ 
ing  to  Horner’s  rule  into  an  equivalent  program  that  essentially 
uses  the  more  efficient  finite  difference  method  of  Briggs.  Indeed, 
this  is  a  surprising  and  important  result  that  demonstrates  that 
the  success  of  polynomial  evaluation  by  differencing  results  from 
properties  of  the  elementary  operations  used  to  form  polynomials 
rather  than  from  properties  exclusive  to  polynomials.  In  other 
words,  Cocke’s  method  works  because  the  following  distributive 
and  associative  laws  hold  for  sums  and  products: 

(i  ±  delta)  *  c  =>  i  *  c  ±  delta  *  c; 

(i  ±  delta)  +  c  =>  (i  +  c)  ±  delta. 

In  (Cocke  and  Schwartz,  1970)  Cocke  and  Schwartz  extend 
this  idea  to  show  how  reduction  in  strength  (which  we  call  finite 
differencing)  applies  to  a  wide  range  of  arithmetic  operations  that 
exhibit  appropriate  distributive  properties. 

Later  in  the  paper,  after  Paige  and  Koenig  have  introduced  their 
rules  for  finite  differencing  of  set-former  expressions  with  respect  to 
changes  in  argument  values  (an  operation  that  they  sometimes  call  “dif¬ 
ferentiation” ) ,  they  return  to  the  discussion  of  Briggs’s  method  (Paige 
and  Koenig,  1982,  pp.  421): 

Profitable  differentiation  of  an  expression  /  can  sometimes  be 
supported  by  differentiating  /  together  with  a  chain  of  auxiliary 
expressions  (as  in  Briggs’s  first,  second,  . . .,  difference  polynomials 
.  .  .  ).  Thus,  the  prederivative  V~E(x  +:=  delta])  of  the  nth  degree 
polynomial  E  —  P(x)  is 

E+:=P!(x) 
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where  P\{x)  is  the  first  difference  polynomial.  However,  for  the 
prederivative  code  above  to  be  inexpensive,  we  must  also  differenti¬ 
ate  the  second,  third,  . . .,  nth  difference  polynomials,  denoted  E,  — 
Pi(x),  i  —  2 ..n.  To  realize  Briggs’s  efficient  technique,  we  consider 
the  extended  prederivative  (of  expressions  ordered  carefully  into  a 
“differentiable  chain”)  \7~En_ i, . . . ,  E\,E(x  +:=  delta ;)  that  ex¬ 
pands  into 

E  +:=Ei; 

Ei  +:=  E2- 

En— 1  +: —  En , 

Essentially  all  of  the  material  that  Paige  and  Koenig  present  in  their 
paper  to  relate  their  work  to  Briggs’s  method  has  been  quoted  above. 
However,  a  few  details  about  the  derivation  of  Briggs’s  method  were 
not  spelled  out  in  their  treatment,  which  we  now  attempt  to  rectify. 
We  will  show  below  that  computational  divided  differencing  supplies  a 
clean  way  to  handle  an  important  step  in  the  derivation  for  which  what 
Paige  and  Koenig  say  is  ambiguous. 

The  initial  program  for  tabulating  a  polynomial  at  a  collection  of 
equally  spaced  points  can  be  written  in  C++  as  follows:13 


void  Poly :: Tabulate (float  start,  int  numPoints,  float  h) { 

float  x  =  start; 

float  y; 

for  (int  i  =  1;  i  <=  numPoints;  i++) 

{  //  Tabulation  loop 

y  =  Eval(x) ; 

cout  <<  x  <<  "  <<  y  <<  endl; 

x  +=  h; 

} 

} 

In  the  program  shown  above,  Eval  is  the  member  function  of  class  Poly 
that  evaluates  a  polynomial  via  Horner’s  rule  (see  Sect.  3),  of  type 

float  Poly: : Eval (float  x) ; 

13  It  should  be  pointed  out  that  in  practice  it  is  better  to  code  the  statement  in 
the  loop  that  changes  the  value  of  x  as  “x  =  start  +  i  *  h;”,  rather  than  as  “x  += 
h;”,  so  that  small  errors  in  h  do  not  accumulate  in  x  due  to  repeated  addition.  We 
have  chosen  to  use  the  latter  form  to  emphasize  the  similarities  between  Tabulate 
and  the  two  earlier  strength-reduction  examples. 
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The  intention  of  Paige  and  Koenig  is  to  transform  procedure  Poly :  :  Tabulate 
into  something  like  the  following  version: 

void  Poly: : Tabulate (float  start,  int  numPoints,  float  h){ 
float  x  =  start; 
float  y; 

float  E  =  Eval(start);  //  Call  on  Eval  hoisted  out  of  the  loop 
float  Ei  =  ???;  //  Unspecified  initialization 

float  En_i  =  ???; 
float  En  =  ???; 

for  (int  i  =  1;  i  <=  numPoints;  i++)  {  //  Tabulation  loop 
y  =  E; 

cout  <<  x  <<  "  <<  y  <<  endl; 

E  +=  Ei ;  //  Extended  pre-derivative  w.r.t.  x  +=  h; 

Ei  +=  E2; 

En— i  En  ; 

x  +=  h; 

} 

} 


However,  as  indicated  by  the  question  marks  in  the  above  code,  Paige 
and  Koenig  do  not  state  explicitly  how  they  plan  to  arrive  at  the  proper 
initialization  code  that  is  to  be  placed  just  before  the  loop  in  the 
transformed  program.  It  is  unclear  whether  they  intend  to  generate 
the  values  E1:  E2,  . . .,  En_1;  En  by  evaluating  polynomial  P  at  start, 
start+h,  ...,  start+(n-l)*h,  start+n*h  and  then  create  the  E^  via 
subtraction  operations,  or  whether  they  intend  to  generate  the  finite- 
difference  polynomials  Pi,  P2,  . . .,  Pn-i-  Pn  symbolically  and  then  apply 
each  of  them  to  start.  The  former  method  can  lead  to  very  inaccurate 
results  (see  below) ,  whereas  the  latter  method  requires  that  a  substan¬ 
tial  amount  of  symbolic  manipulation  be  performed  to  generate  the 
Pi. 

However,  the  divided-difference  arithmetic  FloatDDRl  gives  us  an 
easy  way  to  create  suitable  initialization  code  that  produces  accurate 
values  for  the  Ei.  This  initialization  code  consists  of  three  steps: 

—  Create  a  FloatV  of  equally  spaced  points,  starting  at  start  and 
separated  by  h,  where  the  number  of  points  is  one  more  than  the 
degree  of  the  polynomial. 
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—  Introduce  a  single  call  on  the  member  function 

FloatDDRl  Poly: :Eval (const  FloatV  &x)  ; 

to  create  the  first  row  of  the  divided-difference  table  for  the  poly¬ 
nomial  with  respect  to  the  given  FloatV. 

—  Convert  the  resulting  FloatDDRl  (which  holds  divided-difference 
values)  into  the  first  row  of  a  finite-difference  table  for  the  poly¬ 
nomial  by  multiplying  each  entry  by  an  appropriate  adjustment 
factor  (see  (Conte  and  de  Boor,  1972,  Lemma  4.1)). 

This  initialization  method  is  used  in  the  version  of  Tabulate  shown 
below.  (In  this  version  of  Tabulate,  the  Ei  are  renamed  diff  Table  [i]  .) 


void  Poly: : Tabulate (float  start,  int  numPoints,  float  h){ 
float  x  =  start; 
float  y; 

//  Create  accurate  divided-difference  table 
FloatV  fv(x,  degree+1,  h) ; 

FloatDDRl  fddrl  =  Eval(fv);  //  Calls  FloatDDRl  Poly: :Eval (const  FloatV  &) ; 

//  Convert  divided-difference  entries  to  finite-difference  entries 

float  *diff Table (new  float [degree+1] )  ; 

float  adjustment  =  1.0; 

for  (int  i  =  0;  i  <=  degree;  i++)  { 

dif f Table [i]  =  fddrl .divDiff Table [i]  *  adjustment; 
adjustment  *=  (h  *  (i+1)); 

} 

for  (int  i  =  1;  i  <=  numPoints;  i++)  {  //  Tabulation  loop 
y  =  dif f Table  [0] ; 
cout  <<  x  <<  ":  "  <<  y  <<  endl; 

for  (int  j  =  0;  j  <  degree;  j++)  {  //  Pre-derivative  w.r.t.  x  +=  h; 
diffTable[j]  +=  dif f Table  [j  +  1] ; 

} 

x  +=  h; 

} 

} 
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Empirical  Results:  Tabulation  of  a  Polynomial  via 
Briggs’s  Method 

We  now  present  some  empirical  results  that  illustrate  the  advantages  of 
the  final  version  of  Poly:  : Tabulate  presented  above.  Again,  we  work 
with  the  polynomial  P (x)  —  2.1  *a;3  —  1.4*;r2  —  .6  *£+ 1.1,  and  perform 
computations  using  single-precision  floating-point  arithmetic  on  a  Sun 
SPARCstation  20/61  running  SunOS  5.6.  Programs  were  compiled  with 
the  egcs-2.91.66  version  of  g++  (egcs-1.1.2  release)  with  optimization  at 
the  -01  level. 

The  final  version  of  Poly:  : Tabulate  uses  the  divided-difference 
arithmetic  FloatDDRl  in  the  initialization  step  that  creates  the  ini¬ 
tial  finite-difference  vector  diffTable.  An  alternative  way  to  gener¬ 
ate  diffTable  is  to  evaluate  polynomial  P  at  start,  start +h,  ..., 
start+(n-l)  *h,  start+n*h  and  then  create  diffTable  via  subtraction 
operations,  according  to  the  standard  definition  (Conte  and  de  Boor, 
1972,  pp.  214).  However,  the  latter  way  of  generating  diffTable  in¬ 
volves  subtraction  operations,  and  hence  may  magnify  any  round-off 
errors  in  the  n  +  1  values  computed  for  P.  In  contrast,  the  method 
using  computational  divided  differencing  yields  a  way  to  create  a  more 
accurate  initial  finite-difference  table. 

To  give  a  concrete  illustration  of  the  benefits,  the  following  table 
shows  what  happens  when  P(x )  is  evaluated  at  the  10,001  points  in 
the  interval  [0.0, 1.0]  with  a  grid  spacing  of  .0001.  (Places  where  the 
results  from  the  three  methods  differ  are  indicated  in  boldface.) 


Evaluate 

via  Horner 

Comp.  Div.  Diff. 

-1-  Briggs 

Standard  Finite  Diff. 

+  Briggs 

X 

P(x) 

P(x) 

P(x) 

0.0000 

1.10000 

1.10000 

1.10000 

0.0001 

1.09994 

1.09994 

1.09994 

0.0002 

1.09988 

1.09988 

1.09988 

0.9998 

1.19942 

1.19941 

19844.3 

0.9999 

1.19971 

1.19970 

19850.3 

1.0000 

1.20000 

1.19999 

19856.2 

Time 

7.64 

5.62 

5.49 

(milliseconds) 

The  numbers  that  appear  in  the  rightmost  column  for  P(.9998),  P(.9999), 
and  P(1.0000)  are  not  typographical  errors.  What  happens  is  that 
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round-off  errors  in  the  computation  of  the  initial  values  for  P  are  magni¬ 
fied  by  the  standard  method  for  computing  the  initial  finite-difference 
table,  and  after  10,000  iterations  of  the  Briggs  calculation,  the  accu¬ 
mulated  effect  has  caused  the  values  produced  to  diverge  widely  from 
the  correct  answers. 

Overall,  the  method  based  on  computational  divided  differencing  is 
far  more  accurate  than  the  one  in  which  the  vector  needed  for  Briggs’s 
method  is  obtained  by  subtraction  operations  (and  only  2%  slower). 
Furthermore,  the  results  from  the  method  based  on  computational 
divided  differencing  are  nearly  as  accurate  as  those  obtained  by  reeval¬ 
uating  the  polynomial  at  each  point,  but  the  reevaluation  method  is 
36%  slower. 


9.  Other  Related  Work 

Sect.  8  described  how  our  ideas  relate  to  Robert  Paige’s  work  on  finite 
differencing  of  set-valued  expressions  in  SETL  programs.  This  section 
concerns  other  related  work,  which  falls  into  four  categories: 

Computational  Differentiation 

Computational  differentiation  is  a  well-established  area  of  numerical 
analysis,  with  its  own  substantial  literature  (Wengert,  1964;  Rail,  1981; 
Griewank  and  Corliss,  1992;  Berz  et  al.,  1996;  Griewank,  2000).  The 
importance  of  the  subject  is  underscored  by  the  fact  that  the  1995 
Wilkinson  Prize  for  outstanding  contributions  to  the  field  of  numerical 
software  was  awarded  to  Chris  Bischof  (then  at  Argonne)  and  Alan 
Carle  (Rice)  for  the  development  of  the  FORTRAN  computational- 
differentiation  system  ADIFOR  2  (Bischof  et  al.,  1996).  As  discussed 
in  Sect.  4,  computational  divided  differencing  is  a  generalization  of 
computational  differentiation:  a  program  resulting  from  computational 
divided  differencing  can  be  used  to  obtain  derivatives  (as  well  as  divided 
differences) ,  whereas  a  program  resulting  from  computational  differen¬ 
tiation  can  only  produce  derivatives  (and  not  divided  differences). 

Other  Work  on  Accurate  Divided  Differencing 

The  program-transformation  techniques  for  creating  accurate  divided 
differences  described  in  this  paper  are  based  on  a  1964  result  of  Opitz’s  (Opitz, 
1964),  which  was  later  rediscovered  in  1980  by  McCurdy  (McCurdy, 

1980)  and  again  in  1998  by  one  of  us  (Reps).  However,  Opitz  and 
McCurdy  both  discuss  how  to  create  accurate  divided  differences  only 
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for  expressions.  In  this  paper,  the  idea  has  been  applied  to  the  creation 
of  accurate  divided  differences  for  functions  defined  by  programs. 

As  already  mentioned  in  Sect.  2,  the  computational-differentiation 
technique  that  was  summarized  there  is  what  is  known  as  forward-mode 
differentiation.  A  different  computational-differentiation  technique,  re¬ 
verse  mode  (Linnainmaa,  1976;  Speelpenning,  1980;  Iri,  1984;  Griewank, 
1989;  Griewank,  1991),  is  generally  preferable  when  the  number  of 
independent  variables  is  much  greater  than  the  number  of  dependent 
variables.  However,  although  it  is  possible  to  develop  a  reverse-mode 
version  of  computational  divided  differencing,  it  does  not  appear  to 
offer  the  same  potential  savings  in  operations  performed  that  reverse 
mode  achieves  for  computational  differentiation. 

McCurdy,  and  later  Kahan  and  Fateman  (Kahan  and  Fateman, 
1985)  and  Rail  and  Reps  (Rail  and  Reps,  2000),  have  looked  at  ways 
to  compute  accurate  divided  differences  for  library  functions  (i.e.,  sin, 
cos,  exp,  etc.). 

Kahan  and  Fateman  have  also  investigated  how  similar  techniques 
can  be  used  to  avoid  unsatisfactory  numerical  answers  when  evalu¬ 
ating  formulas  returned  by  symbolic-algebra  systems.  In  particular, 
their  work  was  motivated  by  the  observation  that  naive  evaluation  of  a 
definite  integral  f(x)dx  can  sometimes  produce  meaningless  answers: 
When  a  symbolic-algebra  system  produces  a  closed-form  solution  for 
the  indefinite  integral  f  f(x)dx ,  say  G(x)1  the  result  of  the  computation 
G(b)  —  G(a)  may  have  no  significant  digits.  Kahan  and  Fateman  show 
that  divided  differences  can  be  used  to  develop  accurate  numerical 
formulas  that  sidestep  this  problem. 

One  of  the  techniques  developed  by  McCurdy  for  computing  accu¬ 
rate  divided-difference  tables  involved  first  computing  just  the  first  row 
of  the  table  and  then  generating  the  rest  of  the  entries  by  a  backfill¬ 
ing  algorithm.  He  studied  the  conditions  under  which  this  technique 
maintained  sufficient  accuracy.  However,  his  algorithm  for  accurately 
computing  the  first  row  of  the  divided-difference  table  was  based  on 
a  series  expansion  of  the  function,  rather  than  a  divided-difference 
arithmetic,  such  as  the  FloatDDRl  arithmetic  developed  in  Sect.  6. 

Divided-difference  arithmetic  for  first  divided  differences  has  also 
been  called  slope  arithmetic ,  and  an  interval  version  of  it  has  been 
investigated  previously  as  a  way  to  obtain  an  interval  enclosure  for 
the  range  of  a  function  evaluated  over  an  interval  (Krawczyk  and  Neu- 
maier,  1985;  Zuhe  and  Wolfe,  1990;  Ratz,  1996).  It  has  been  shown 
that  interval  slope  arithmetic  can  yield  tighter  enclosures  than  methods 
based  on  derivatives  (Krawczyk  and  Neumaier,  1985;  Zuhe  and  Wolfe, 
1990;  Ratz,  1996).  Zuhe  and  Wolfe  have  also  shown  that  a  form  of 
interval  divided-difference  arithmetic  involving  second  divided  differ- 
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ences  can  provide  even  tighter  interval  enclosures  (Zuhe  and  Wolfe, 
1990).  In  the  present  paper,  we  have  confined  ourselves  to  point  (non¬ 
interval)  divided-difference  arithmetics,  but  have  explored  the  use  of  a 
divided-difference  arithmetic  for  divided  differences  of  arbitrary  order 
(originally  due  to  Opitz  (Opitz,  1964)).  We  have  shown  how  to  extend 
the  latter  divided-difference  arithmetic  to  the  case  of  multi-variate 
functions  involving  a  fixed,  but  arbitrary,  number  of  variables,  and 
have  also  proposed  various  specializations  of  it,  which  would  improve 
run-time  efficiency  in  certain  situations. 

Other  Work  on  Controlling  Round-Off  Error  in 
Numerical  Computations 

Computational  differentiation  and  computational  divided  differencing 
are  methods  for  controlling  round-off  error  that  can  arise  in  two  types 
of  numerical  computations.  An  extensive  collection  of  methods  for  con¬ 
trolling  round-off  error  for  a  wide  variety  of  numerical  computations  has 
been  developed  by  Kulisch’s  group  at  the  University  of  Karlsruhe  (Ham¬ 
mer  et  al.,  1993;  Hammer  et  al.,  1995). 

In  future  work,  we  plan  to  investigate  the  use  of  such  techniques 
to  achieve  greater  accuracy  (and  to  validate  results)  in  steps  of  the 
computational-divided-differencing  transformations  where  operations 
are  implemented  by  solving  systems  of  linear  equations,  such  as  division 
operations  (cf.  box  (5)). 

Other  Work  that  Exploits  Operator  Overloading 

The  operator-overloading  feature  of  C++  provides  a  convenient  mech¬ 
anism  for  implementing  the  divided-difference  arithmetics  that  are  de¬ 
scribed  in  this  paper.  Compared  to  other  methods  of  providing  poly¬ 
morphism  in  programming  languages,  such  as  parametric  polymorphism  (Mil¬ 
ner,  1978)  and  inheritance  polymorphism  (Cardelli,  1984),  operator 
overloading  has  always  been  something  of  a  poor  cousin.  In  the  programming- 
languages  community,  operator  overloading  is  sometimes  referred  to 
as  “ ad  hoc  polymorphism”  (Strachey,  2000),  a  term  that  has  pejo¬ 
rative  overtones.  Nevertheless,  in  addition  to  its  application  in  com¬ 
putational  differentiation  and  computational  divided  differencing,  op¬ 
erator  overloading  provides  a  convenient  implementation  mechanism 
for  a  wide  variety  of  other  interesting  applications,  including  partial 
evaluation  (Andersen,  1994),  safe  pointers  (Austin  et  al.,  1994),  and 
expression  templates  (Veldhuizen,  1995). 
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